1. 10 Nov, 2016 4 commits
  2. 28 Mar, 2016 1 commit
    • netfilter: ipset: fix race condition in ipset save, swap and delete · 596cf3fe
      Vishwanath Pai authored
      This fix adds a new reference counter (ref_netlink) for the struct ip_set.
      The other reference counter (ref) can be swapped out by ip_set_swap and we
      need a separate counter to keep track of references for netlink events
      like dump. Using the same ref counter for dump causes a race condition
      which can be demonstrated by the following script:
      ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \
      ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \
      ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \
      ipset save &
      ipset swap hash_ip3 hash_ip2
      ipset destroy hash_ip3 /* will crash the machine */
      Swap will exchange the values of ref so destroy will see ref = 0 instead of
      ref = 1. With this fix in place swap will not succeed because ipset save
      still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink).
      Both delete and swap will error out if ref_netlink != 0 on the set.
      Note: The *_head functions are changed because previously we would
      increment ref whenever we called these functions; we no longer do that.
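The interaction between the two counters can be sketched as a toy model. This is only an illustration under stated assumptions: `IpSet`, `swap`, `can_destroy` and `scenario` are hypothetical Python stand-ins, not the kernel functions, and the model assumes that `ipset save` is busy dumping hash_ip2 when the swap arrives.

```python
class IpSet:
    """Toy stand-in for struct ip_set; only the two counters matter here."""

    def __init__(self, name):
        self.name = name
        self.ref = 0          # ordinary references; exchanged by swap
        self.ref_netlink = 0  # references held by netlink operations such as dump


def swap(a, b, check_netlink):
    """Exchange names and ref values, roughly as the commit describes ip_set_swap."""
    if check_netlink and (a.ref_netlink or b.ref_netlink):
        return False  # with the fix: refuse while a dump is in progress
    a.name, b.name = b.name, a.name
    a.ref, b.ref = b.ref, a.ref  # ref_netlink is deliberately NOT swapped
    return True


def can_destroy(s, check_netlink):
    """A set may be destroyed only if nothing references it."""
    if check_netlink and s.ref_netlink:
        return False
    return s.ref == 0


def scenario(use_ref_netlink):
    """Replay save/swap/destroy; return (swapped, destroyed_while_dumping)."""
    big = IpSet("hash_ip2")
    small = IpSet("hash_ip3")
    # "ipset save" starts dumping the big set and pins it (assumption of this model).
    if use_ref_netlink:
        big.ref_netlink += 1
    else:
        big.ref += 1
    swapped = swap(small, big, check_netlink=use_ref_netlink)
    # "ipset destroy hash_ip3" acts on whichever object now carries that name.
    victim = big if big.name == "hash_ip3" else small
    destroyed_while_dumping = victim is big and can_destroy(victim, use_ref_netlink)
    return swapped, destroyed_while_dumping


if __name__ == "__main__":
    print("single counter:", scenario(False))
    print("with ref_netlink:", scenario(True))
```

With a single counter, the swap moves the dump's reference to the other set, so the set still being dumped looks unreferenced and can be destroyed. With a separate ref_netlink, the swap is refused and the set being dumped stays pinned.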
      Reviewed-by: Joshua Hunt <johunt@akamai.com>
      Signed-off-by: Vishwanath Pai <vpai@akamai.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 07 Nov, 2015 3 commits
  4. 28 Aug, 2015 1 commit
    • netfilter: ipset: Out of bound access in hash:net* types fixed · 6fe7ccfd
      Jozsef Kadlecsik authored
      Dave Jones reported that KASan detected an out-of-bounds access in the
      hash:net* types:
      [   23.139532] ==================================================================
      [   23.146130] BUG: KASan: out of bounds access in hash_net4_add_cidr+0x1db/0x220 at addr ffff8800d4844b58
      [   23.152937] Write of size 4 by task ipset/457
      [   23.159742] =============================================================================
      [   23.166672] BUG kmalloc-512 (Not tainted): kasan: bad access detected
      [   23.173641] -----------------------------------------------------------------------------
      [   23.194668] INFO: Allocated in hash_net_create+0x16a/0x470 age=7 cpu=1 pid=456
      [   23.201836]  __slab_alloc.constprop.66+0x554/0x620
      [   23.208994]  __kmalloc+0x2f2/0x360
      [   23.216105]  hash_net_create+0x16a/0x470
      [   23.223238]  ip_set_create+0x3e6/0x740
      [   23.230343]  nfnetlink_rcv_msg+0x599/0x640
      [   23.237454]  netlink_rcv_skb+0x14f/0x190
      [   23.244533]  nfnetlink_rcv+0x3f6/0x790
      [   23.251579]  netlink_unicast+0x272/0x390
      [   23.258573]  netlink_sendmsg+0x5a1/0xa50
      [   23.265485]  SYSC_sendto+0x1da/0x2c0
      [   23.272364]  SyS_sendto+0xe/0x10
      [   23.279168]  entry_SYSCALL_64_fastpath+0x12/0x6f
      The bug is fixed in the patch and the testsuite is extended in ipset
      to check cidr handling more thoroughly.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
  5. 14 Jun, 2015 5 commits
    • netfilter: ipset: Introduce RCU locking in hash:* types · 18f84d41
      Jozsef Kadlecsik authored
      Three types of data need to be protected in the case of the hash types:
      a. The hash buckets: standard rcu pointer operations are used.
      b. The element blobs in the hash buckets are stored in an array and
         a bitmap is used for book-keeping to tell which elements in the array
         are used or free.
      c. Networks per cidr values and the cidr values themselves are stored
         in fixed-size arrays and need no protection. The values are modified
         in such an order that in the worst case an element test is repeated
         once with the same cidr value.
      The ipset hash approach uses arrays instead of lists and therefore is
      incompatible with rhashtable.
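The array-plus-bitmap bookkeeping from item b can be sketched as follows. This is a minimal model, not the kernel data structure: the `Bucket` class, its slot count and its method names are hypothetical, and the write-then-publish order in `add` is an assumption about how such a scheme can stay safe for concurrent readers. Python provides no memory barriers, so the model only shows the bookkeeping.

```python
class Bucket:
    """Toy hash bucket: fixed-size slot array plus a bitmap of used slots."""

    def __init__(self, size=4):
        self.slots = [None] * size
        self.used = 0  # bit i set means slots[i] holds a valid element

    def add(self, elem):
        """Store elem in the first free slot; return its index or -1 if full."""
        for i in range(len(self.slots)):
            if not self.used & (1 << i):
                self.slots[i] = elem   # write the element first...
                self.used |= 1 << i    # ...then mark the slot as used
                return i
        return -1  # bucket full; real code would have to grow or resize

    def delete(self, elem):
        """Clear the bitmap bit of the slot holding elem, if any."""
        for i, e in enumerate(self.slots):
            if self.used & (1 << i) and e == elem:
                self.used &= ~(1 << i)  # slot content is ignored from now on
                return True
        return False

    def test(self, elem):
        """Membership test that only looks at slots marked as used."""
        return any(self.used & (1 << i) and e == elem
                   for i, e in enumerate(self.slots))


if __name__ == "__main__":
    b = Bucket(2)
    b.add("198.18.0.1")
    print(b.test("198.18.0.1"), bin(b.used))
```

Because a deleted element is only unmarked in the bitmap, its slot can be reused by a later add without moving the remaining elements.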
      Performance is tested by Jesper Dangaard Brouer:
      Simple drop in FORWARD
      Dropping via simple iptables net-mask match::
       iptables -t raw -N simple || iptables -t raw -F simple
       iptables -t raw -I simple  -s -j DROP
       iptables -t raw -D PREROUTING -j simple
       iptables -t raw -I PREROUTING -j simple
      Drop performance in "raw": 11.3Mpps
      Generator: sending 12.2Mpps (tx:12264083 pps)
      Drop via original ipset in RAW table
      Create a set with lots of elements::
       sudo ./ipset destroy test
       echo "create test hash:ip hashsize 65536" > test.set
       for x in `seq 0 255`; do
          for y in `seq 0 255`; do
              echo "add test 198.18.$x.$y" >> test.set
          done
       done
       sudo ./ipset restore < test.set
      Dropping via ipset::
       iptables -t raw -F
       iptables -t raw -N net198 || iptables -t raw -F net198
       iptables -t raw -I net198 -m set --match-set test src -j DROP
       iptables -t raw -I PREROUTING -j net198
      Drop performance in "raw" with ipset: 8Mpps
      Perf report numbers ipset drop in "raw"::
       +   24.65%  ksoftirqd/1  [ip_set]           [k] ip_set_test
       -   21.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock_bh
          - _raw_read_lock_bh
             + 99.88% ip_set_test
       -   19.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_unlock_bh
          - _raw_read_unlock_bh
             + 99.72% ip_set_test
       +    4.31%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_kadt
       +    2.27%  ksoftirqd/1  [ixgbe]            [k] ixgbe_fetch_rx_buffer
       +    2.18%  ksoftirqd/1  [ip_tables]        [k] ipt_do_table
       +    1.81%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_test
       +    1.61%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
       +    1.44%  ksoftirqd/1  [kernel.kallsyms]  [k] build_skb
       +    1.42%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
       +    1.36%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
       +    1.16%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_gro_receive
       +    1.09%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_unlock
       +    0.96%  ksoftirqd/1  [ixgbe]            [k] ixgbe_clean_rx_irq
       +    0.95%  ksoftirqd/1  [kernel.kallsyms]  [k] __netdev_alloc_frag
       +    0.88%  ksoftirqd/1  [kernel.kallsyms]  [k] kmem_cache_alloc
       +    0.87%  ksoftirqd/1  [xt_set]           [k] set_match_v3
       +    0.85%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_gro_receive
       +    0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_iterate
       +    0.76%  ksoftirqd/1  [kernel.kallsyms]  [k] put_compound_page
       +    0.75%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_lock
      Drop via ipset in RAW table with RCU-locking
      With RCU locking, the RW-lock is gone.
      Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps
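As a rough cross-check of the figures above, the packet rates can be converted to per-packet time and compared with the lock overhead seen in the perf report. This is back-of-the-envelope arithmetic only: it assumes that the perf percentages translate directly into per-packet time on one CPU, which is a simplification, and the helper name `ns_per_packet` is made up for this sketch.

```python
def ns_per_packet(mpps):
    """Convert a rate in million packets per second to nanoseconds per packet."""
    return 1000.0 / mpps


# Figures quoted in the commit message.
ipset_rwlock_mpps = 8.0   # ipset drop in "raw" with the RW-lock
ipset_rcu_mpps = 11.3     # ipset drop in "raw" with RCU locking
generator_mpps = 12.2     # what the traffic generator sends
lock_share = (21.42 + 19.42) / 100  # _raw_read_lock_bh + _raw_read_unlock_bh

cost_rwlock = ns_per_packet(ipset_rwlock_mpps)  # 125 ns per packet
cost_rcu = ns_per_packet(ipset_rcu_mpps)        # about 88.5 ns per packet
saved = cost_rwlock - cost_rcu                  # about 36.5 ns per packet

# Rate one would naively expect if the lock share simply disappeared.
projected_mpps = ipset_rwlock_mpps / (1.0 - lock_share)

if __name__ == "__main__":
    print(f"RW-lock: {cost_rwlock:.1f} ns/pkt, RCU: {cost_rcu:.1f} ns/pkt")
    print(f"saved: {saved:.1f} ns/pkt, naive projection: {projected_mpps:.1f} Mpps")
    print("projection exceeds generator rate:", projected_mpps > generator_mpps)
```

Under these assumptions the naive projection lands above the generator's send rate. That fits the measured result, in which the RCU variant reaches the same 11.3Mpps as the plain iptables rule, so the set lookup no longer appears to be the limiting factor.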
      Performance-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Fix parallel resizing and listing of the same set · c4c99783
      Jozsef Kadlecsik authored
      When elements are added to a hash:* type of set and a resize is
      triggered, a parallel listing could start with the original set (before
      resizing) and "continue" with the new one. Fix it by taking references
      and using the original hash table for the whole listing. As a result,
      the original hash table may be destroyed from either the resizing or the
      listing function.
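The reference scheme described above can be sketched with a toy model. This is an illustration only: `Table`, `HashSet` and their methods are hypothetical names, plain integers stand in for the kernel's atomic counters, and no real concurrency is involved.

```python
class Table:
    """Toy hash table with the bookkeeping needed for safe listing."""

    def __init__(self, elems):
        self.elems = list(elems)
        self.ref = 0            # listers currently walking this table
        self.retired = False    # True once a resize has replaced it
        self.destroyed = False  # stands in for freeing the memory


class HashSet:
    def __init__(self, elems):
        self.table = Table(elems)

    def list_start(self):
        """Pin the current table; the whole listing uses this one table."""
        t = self.table
        t.ref += 1
        return t

    def list_done(self, t):
        """Unpin; if a resize retired the table meanwhile, the lister frees it."""
        t.ref -= 1
        if t.ref == 0 and t.retired:
            t.destroyed = True

    def resize(self):
        """Install a new table; free the old one only if nobody is listing it."""
        old = self.table
        self.table = Table(old.elems)
        old.retired = True
        if old.ref == 0:
            old.destroyed = True
        return old


if __name__ == "__main__":
    s = HashSet(["198.18.0.1", "198.18.0.2"])
    pinned = s.list_start()
    s.resize()
    print("freed during listing:", pinned.destroyed)
    s.list_done(pinned)
    print("freed after listing:", pinned.destroyed)
```

Whichever side drops the last interest in the original table, the resizer or the lister, is the one that destroys it.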
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    • netfilter: ipset: Fix cidr handling for hash:*net* types · f690cbae
      Jozsef Kadlecsik authored
      Commit "Simplify cidr handling for hash:*net* types" broke the cidr
      handling for the hash:*net* types when the sets were used by the SET
      target: entries with invalid cidr values were added to the sets.
      Reported by Jonathan Johnson.
      Testsuite entry is added to verify the fix.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
  6. 13 May, 2015 3 commits
  7. 03 Dec, 2014 4 commits
  8. 15 Sep, 2014 2 commits
  9. 10 Sep, 2014 1 commit
  10. 24 Aug, 2014 1 commit
  11. 06 Mar, 2014 2 commits
  12. 27 Oct, 2013 1 commit
  13. 22 Oct, 2013 1 commit
  14. 30 Sep, 2013 11 commits