Seccomp for Developers
Making apps more secure
Alexander Reelsen
alex@elastic.co | @spinscale

Agenda
- What is seccomp and why should I care as a developer?
- Using seccomp in high-level languages (Java, Crystal, Python)
- Monitoring seccomp violations

Product Overview

Elastic Stack: building & lego blocks
- seccomp features used in Elasticsearch & Beats

Security is a requirement
- High adoption
- Providing software vs. operating it
- No assumptions about environment (AppArmor, SELinux)
- Multiple layers (Java Security Manager and seccomp)

What is seccomp?

What's the problem?
- Run untrusted code in your system
- No virtualization, but isolation
- Limit code to prevent certain dangerous system calls

History lesson
- 2005/2.6.12: strict mode allowing only the read, write, exit and sigreturn system calls, enabled via the proc file system
- 2007/2.6.23: Added new prctl() argument
- 2012/3.5: Allow configurable seccomp-bpf filter in prctl() call
- 2014/3.17: Own seccomp() system call

Seccomp users
- Elasticsearch & Beats
- Docker, systemd, Android
- Chrome, Firefox
- OpenSSH
- Firecracker

How does this work?
- A process tells the operating system to limit its own abilities
- Alternatively, a management process (e.g. systemd) does the same before startup
- One-way transition
- The list of allowed/blocked calls is called a seccomp filter

Usage
    prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, prog);
or
    seccomp(SECCOMP_SET_MODE_FILTER, 0, &prog)
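To make the prog argument of the two calls more concrete, here is a minimal Python sketch using ctypes. The struct layouts follow the kernel's struct sock_filter and struct sock_fprog; the helper names build_allow_all and install are made up for this illustration. install() is Linux-only and is defined but deliberately not called, because the transition cannot be undone.

```python
import ctypes

# Constants from <linux/prctl.h>, <linux/seccomp.h> and <linux/filter.h>
PR_SET_NO_NEW_PRIVS = 38
PR_SET_SECCOMP = 22
SECCOMP_MODE_FILTER = 2
SECCOMP_RET_ALLOW = 0x7FFF0000
BPF_RET, BPF_K = 0x06, 0x00


class SockFilter(ctypes.Structure):
    """Mirrors struct sock_filter: one classic BPF instruction."""
    _fields_ = [("code", ctypes.c_uint16), ("jt", ctypes.c_uint8),
                ("jf", ctypes.c_uint8), ("k", ctypes.c_uint32)]


class SockFprog(ctypes.Structure):
    """Mirrors struct sock_fprog: instruction count plus pointer to them."""
    _fields_ = [("len", ctypes.c_ushort),
                ("filter", ctypes.POINTER(SockFilter))]


def build_allow_all():
    """Hypothetical helper: a one-instruction filter that allows every call."""
    insns = (SockFilter * 1)(
        SockFilter(BPF_RET | BPF_K, 0, 0, SECCOMP_RET_ALLOW))
    prog = SockFprog(len(insns), ctypes.cast(insns, ctypes.POINTER(SockFilter)))
    return prog, insns  # return insns too so the array stays referenced


def install(prog):
    """Install the filter for the calling thread (Linux only, irreversible)."""
    libc = ctypes.CDLL(None, use_errno=True)
    # An unprivileged process has to set no_new_privs before adding a filter.
    if libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_NO_NEW_PRIVS) failed")
    if libc.prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ctypes.byref(prog)) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_SECCOMP) failed")


prog, insns = build_allow_all()
print(prog.len, hex(prog.filter[0].k))
```

A real filter would contain more than one instruction; the allow-all program only shows the shape of the data the kernel expects.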


Simple Example
    firejail --noprofile --seccomp.drop=bind -c strace nc -v -l -p 8000
Check the bind() system call in the output…
    socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    setsockopt(3, SOL_SOCKET, SO_REUSEPORT, [1], 4) = 0
    bind(3, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("0.0.0.0")}, 16) = ?
    +++ killed by SIGSYS +++

Check dmesg output
    [ 535.197019] audit: type=1326 audit(1592235264.942:94): auid=1000 uid=1000 gid=1000 ses=4 subj==unconfined pid=6664 comm="nc" exe="/usr/bin/nc.traditional" sig=31 arch=c000003e syscall=49 compat=0 ip=0x7ffb85de7497 code=0x0
    [ 535.197022] audit: type=1701 audit(1592235264.942:95): auid=1000 uid=1000 gid=1000 ses=4 subj==unconfined pid=6664 comm="nc" exe="/usr/bin/nc.traditional" sig=31 res=1

Use ausearch (part of auditd)
Run sudo /usr/sbin/ausearch --syscall bind
    time->Mon Jun 15 15:38:32 2020
    type=SECCOMP msg=audit(1592235512.578:148): auid=1000 uid=1000 gid=1000 ses=4 subj==unconfined pid=6939 comm="nc" exe="/usr/bin/nc.traditional" sig=31 arch=c000003e syscall=49 compat=0 ip=0x7f67398a0497 code=0x0

Hard to read
    time->Mon Jun 15 15:38:32 2020
    type=SECCOMP msg=audit(1592235512.578:148): auid=1000 uid=1000 gid=1000 ses=4 subj==unconfined pid=6939 comm="nc" exe="/usr/bin/nc.traditional" sig=31 arch=c000003e syscall=49 compat=0 ip=0x7f67398a0497 code=0x0
- type: type of event
- msg: timestamp and unique ID (can be shared among several records)
- auid: audit user ID (stays the same even when using su -)
- uid: user ID
- gid: group ID
- ses: session ID
- subj: SELinux context
- pid: process ID
- comm: command name
- exe: path to the executable
- sig: 31, aka SIGSYS
- arch: CPU architecture
- syscall: syscall number (49 is bind()), see ausyscall --dump
- compat: syscall compatibility mode
- ip: instruction pointer
- code: seccomp action
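The field list can be turned into a small decoder. The following Python sketch parses one type=SECCOMP record into a dictionary and resolves the architecture and syscall number to names. The lookup tables are deliberately tiny and cover x86_64 only, and the function name parse_seccomp_record is made up for this example.

```python
import re

# Minimal lookup tables (x86_64 only); `ausyscall --dump` lists the full set.
AUDIT_ARCH_X86_64 = 0xC000003E
ARCHES = {AUDIT_ARCH_X86_64: "x86_64"}
SYSCALLS_X86_64 = {49: "bind", 57: "fork", 58: "vfork",
                   59: "execve", 322: "execveat"}

# key=value pairs; values are either double-quoted or run to the next space.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')

SAMPLE = ('type=SECCOMP msg=audit(1592235512.578:148): auid=1000 uid=1000 '
          'gid=1000 ses=4 subj==unconfined pid=6939 comm="nc" '
          'exe="/usr/bin/nc.traditional" sig=31 arch=c000003e syscall=49 '
          'compat=0 ip=0x7f67398a0497 code=0x0')


def parse_seccomp_record(line):
    """Turn one type=SECCOMP audit record into a dict with decoded names."""
    fields = {key: value.strip('"') for key, value in FIELD.findall(line)}
    arch = int(fields["arch"], 16)   # arch is logged as bare hex
    number = int(fields["syscall"])  # syscall number is logged as decimal
    fields["arch_name"] = ARCHES.get(arch, "unknown")
    if arch == AUDIT_ARCH_X86_64:
        fields["syscall_name"] = SYSCALLS_X86_64.get(number, "unknown")
    else:
        fields["syscall_name"] = "unknown"
    return fields


record = parse_seccomp_record(SAMPLE)
print(record["comm"], record["arch_name"], record["syscall_name"])
```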

Why?


Run untrusted code in your system
Your code is untrusted code!
    http://localhost:8080/cgi-bin/ping.pl?1.1.1.1 ; ls -al

Good case
    perl -e 'print `ping -c 1 $ARGV[0]`' 1.1.1.1


command execution
    perl -e 'print `ping -c 1 $ARGV[0]`' 1.1.1.1
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 ; ls -al"
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 || ls -al"
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 && ls -al"


DoS
    perl -e 'print `ping -c 1 $ARGV[0]`' 1.1.1.1
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 ; ls -al"
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 || ls -al"
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 && ls -al"
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 -c 100000"
    perl -e 'print `ping -c 1 $ARGV[0]`' "1.1.1.1 -c 100000 > /tmp/foo"
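All of these attacks rely on a shell parsing the user-supplied argument. As a rough illustration (not part of the original talk), the following Python sketch validates the target as an IP address and passes the arguments as a list, so no shell is involved. The function names build_ping_argv and ping are made up for this example, and ping() is not called here because it would need the ping binary and network access.

```python
import ipaddress
import subprocess


def build_ping_argv(target, count=1):
    """Return an argv list for ping, or raise ValueError for a non-IP target."""
    address = ipaddress.ip_address(target)  # raises ValueError otherwise
    # A list argv goes to the program directly; no shell parses it, so
    # ';', '||', '&&' and '>' carry no special meaning.
    return ["ping", "-c", str(int(count)), str(address)]


def ping(target):
    """Run ping without a shell and with an upper bound on run time."""
    return subprocess.run(build_ping_argv(target), capture_output=True,
                          text=True, timeout=10, check=False).stdout


print(build_ping_argv("1.1.1.1"))
```

Validation narrows what reaches the binary, but it does not replace a seccomp filter: the filter still applies if a check like this is missing or wrong.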

Running as root!
    $ ls -l /bin/ping
    -rwsr-xr-x 1 root root 78168 Feb 16 2019 /bin/ping
Hint: Ensure iputils-ping is installed

Which processes are using seccomp right now?
    # for i in $(grep Seccomp /proc/*/status | grep -v '0$' | cut -d'/' -f3) ; do ps hww $i ; done
    16708 pts/1 S+   0:00 python3 python-seccomp/app.py -s
      221 ?     Ss   0:01 /lib/systemd/systemd-journald
      243 ?     Ss   0:00 /lib/systemd/systemd-udevd
      345 ?     Ss   0:00 /lib/systemd/systemd-logind
     6034 ?     Ssl  9:48 /usr/share/elasticsearch/jdk/bin/java … org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
     6371 ?     Ssl  4:47 /usr/share/auditbeat/bin/auditbeat -environment systemd -c /etc/auditbeat/auditbeat.yml -path.home /usr/share/auditbeat -path.config /etc/auditbeat -path.data /var/lib/auditbeat -path.logs /var/log/auditbeat
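The same check can be sketched in Python by reading the Seccomp field of each /proc/<pid>/status file (0 = disabled, 1 = strict, 2 = filter). The function names are made up for this example; processes_using_seccomp() only yields results on Linux.

```python
import glob

MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Extract the Seccomp mode from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        # Match "Seccomp:" exactly; newer kernels also have "Seccomp_filters:".
        if line.startswith("Seccomp:"):
            return int(line.split(":", 1)[1])
    return None  # field missing, e.g. kernel built without seccomp


def processes_using_seccomp():
    """Yield (pid, mode name) for every process with seccomp enabled."""
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as handle:
                mode = seccomp_mode(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if mode:
            yield int(path.split("/")[2]), MODES.get(mode, "unknown")


for pid, mode in processes_using_seccomp():
    print(pid, mode)
```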

Seccomp filters
- A set of rules to check every system call against
- Written in BPF (no loops or backward jumps, dead code detection, directed acyclic graph)
- BPF filtering is done in kernel space (efficient)
- Possible outcomes:
  - the system call is allowed
  - the process/thread is killed
  - an error is returned to the caller
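A filter reports its outcome as a single 32-bit return value: the upper 16 bits select the action and the lower 16 bits carry data, such as the errno to return. A short Python sketch of that encoding, using the constants from <linux/seccomp.h> (the function name decode_action is made up for this example):

```python
SECCOMP_RET_ACTION_FULL = 0xFFFF0000  # mask for the action part
SECCOMP_RET_DATA = 0x0000FFFF         # mask for the data part

ACTIONS = {
    0x80000000: "KILL_PROCESS",
    0x00000000: "KILL_THREAD",
    0x00030000: "TRAP",
    0x00050000: "ERRNO",
    0x7FC00000: "USER_NOTIF",
    0x7FF00000: "TRACE",
    0x7FFC0000: "LOG",
    0x7FFF0000: "ALLOW",
}


def decode_action(value):
    """Split a seccomp filter return value into (action name, data)."""
    action = ACTIONS.get(value & SECCOMP_RET_ACTION_FULL, "UNKNOWN")
    return action, value & SECCOMP_RET_DATA


print(decode_action(0x7FFF0000))
print(decode_action(0x00050000 | 13))
```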

Using seccomp in Java
- Java has the ability to call native code!
- See Elasticsearch's SystemCallFilter.java

BPF magic in Java
    // BPF installed to check arch, limit, then syscall.
    // See https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt for details.
    SockFilter insns[] = {
      /* 1  */ BPF_STMT(BPF_LD  + BPF_W   + BPF_ABS, SECCOMP_DATA_ARCH_OFFSET),
      /* 2  */ BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K,   arch.audit,    0, 7),     // if (arch != audit) goto fail;
      /* 3  */ BPF_STMT(BPF_LD  + BPF_W   + BPF_ABS, SECCOMP_DATA_NR_OFFSET),
      /* 4  */ BPF_JUMP(BPF_JMP + BPF_JGT + BPF_K,   arch.limit,    5, 0),     // if (syscall > LIMIT) goto fail;
      /* 5  */ BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K,   arch.fork,     4, 0),     // if (syscall == FORK) goto fail;
      /* 6  */ BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K,   arch.vfork,    3, 0),     // if (syscall == VFORK) goto fail;
      /* 7  */ BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K,   arch.execve,   2, 0),     // if (syscall == EXECVE) goto fail;
      /* 8  */ BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K,   arch.execveat, 1, 0),     // if (syscall == EXECVEAT) goto fail;
      /* 9  */ BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),                   // pass: return OK;
      /* 10 */ BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)), // fail: return EACCES;
    };
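To see how the jump offsets in this program play out, here is a small Python simulation of the ten instructions. It is a sketch, not the kernel's BPF interpreter: it only supports the four opcodes used above, and the x86_64 values filled in for the arch.* fields (audit arch, limit, syscall numbers) are assumptions made for this example.

```python
import struct

# Opcodes from <linux/bpf_common.h>
LD_W_ABS = 0x00 | 0x00 | 0x20  # BPF_LD + BPF_W + BPF_ABS
JEQ_K = 0x05 | 0x10 | 0x00     # BPF_JMP + BPF_JEQ + BPF_K
JGT_K = 0x05 | 0x20 | 0x00     # BPF_JMP + BPF_JGT + BPF_K
RET_K = 0x06 | 0x00            # BPF_RET + BPF_K

SECCOMP_RET_ALLOW = 0x7FFF0000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_DATA = 0x0000FFFF
EACCES = 13

# Assumed x86_64 values for the arch.* fields.
AUDIT_ARCH_X86_64 = 0xC000003E
LIMIT = 0x3FFFFFFF
FORK, VFORK, EXECVE, EXECVEAT = 57, 58, 59, 322

# (code, jt, jf, k) with the same jump offsets as the Java array.
PROGRAM = [
    (LD_W_ABS, 0, 0, 4),                  # load seccomp_data.arch (offset 4)
    (JEQ_K, 0, 7, AUDIT_ARCH_X86_64),     # wrong arch: skip 7, land on fail
    (LD_W_ABS, 0, 0, 0),                  # load seccomp_data.nr (offset 0)
    (JGT_K, 5, 0, LIMIT),
    (JEQ_K, 4, 0, FORK),
    (JEQ_K, 3, 0, VFORK),
    (JEQ_K, 2, 0, EXECVE),
    (JEQ_K, 1, 0, EXECVEAT),
    (RET_K, 0, 0, SECCOMP_RET_ALLOW),     # pass
    (RET_K, 0, 0, SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)),  # fail
]


def run_filter(program, nr, arch):
    """Execute the program against the first 8 bytes of struct seccomp_data."""
    data = struct.pack("<II", nr, arch)
    acc, pc = 0, 0
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1  # jump offsets are relative to the next instruction
        if code == LD_W_ABS:
            acc = struct.unpack_from("<I", data, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == JGT_K:
            pc += jt if acc > k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    raise ValueError("program ended without a return")


print(hex(run_filter(PROGRAM, EXECVE, AUDIT_ARCH_X86_64)))
```

Every jump only moves forward, which is why the kernel can verify that such a program terminates.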

    // seccomp takes a long, so we pass it one explicitly to keep the JNA simple
    SockFProg prog = new SockFProg(insns);
    prog.write();
    long pointer = Pointer.nativeValue(prog.getPointer());
    int method = 1;
    // install filter, if this works, after this there is no going back!
    // first try it with seccomp(SECCOMP_SET_MODE_FILTER), falling back to prctl()
    if (linux_syscall(arch.seccomp, SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, new NativeLong(pointer)) != 0) {
        method = 0;
        int errno1 = Native.getLastError();
        if (logger.isDebugEnabled()) {
            logger.debug("seccomp(SECCOMP_SET_MODE_FILTER): {}, falling back to prctl(PR_SET_SECCOMP)...",
                         JNACLibrary.strerror(errno1));
        }
        if (linux_prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, pointer, 0, 0) != 0) {
            int errno2 = Native.getLastError();
            throw new UnsupportedOperationException("seccomp(SECCOMP_SET_MODE_FILTER): " + JNACLibrary.strerror(errno1) +
                                                    ", prctl(PR_SET_SECCOMP): " + JNACLibrary.strerror(errno2));
        }
    }
    // now check that the filter was really installed, we should be in filter mode.
    if (linux_prctl(PR_GET_SECCOMP, 0, 0, 0, 0) != 2) {
        throw new UnsupportedOperationException("seccomp filter installation did not really succeed. seccomp(PR_GET_SECCOMP): " +
                                                JNACLibrary.strerror(Native.getLastError()));
    }

    // try seccomp() first
    linux_syscall(arch.seccomp, SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, new NativeLong(pointer));
    // if seccomp() fails due to old kernel, try prctl()
    linux_prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, pointer, 0, 0);
    // ensure filter was successfully installed
    linux_prctl(PR_GET_SECCOMP, 0, 0, 0, 0);

Using JNA
- Java Native Access
- Access native shared libraries without JNI
- Multi-platform


Using seccomp in Go (with libbeat)
    package seccomp
    import (
        "github.com/elastic/go-seccomp-bpf"
    )
    func init() {
        defaultPolicy = &seccomp.Policy{
            DefaultAction: seccomp.ActionErrno,
            Syscalls: []seccomp.SyscallGroup{
                {
                    Action: seccomp.ActionAllow,
                    Names: []string{
                        "accept",
                        "accept4",
                        "access",


Using seccomp in Crystal
    require "seccomp/seccomp"
    class SeccompClient < Seccomp
      def run : Int32
        ctx = uninitialized ScmpFilterCtx
        ctx = seccomp_init(SCMP_ACT_ALLOW)
        # stop executions
        seccomp_rule_add(ctx, SCMP_ACT_ERRNO, seccomp_syscall_resolve_name("execve"), 0)
        seccomp_rule_add(ctx, SCMP_ACT_ERRNO, seccomp_syscall_resolve_name("execveat"), 0)
        seccomp_rule_add(ctx, SCMP_ACT_ERRNO, seccomp_syscall_resolve_name("fork"), 0)
        seccomp_rule_add(ctx, SCMP_ACT_ERRNO, seccomp_syscall_resolve_name("vfork"), 0)
        seccomp_load(ctx)
        ret = seccomp_export_pfc(ctx, STDOUT_FILENO) # optional, dump policy on stdout
        #printf("seccomp_export_pfc result: %d\n", ret)
        seccomp_release(ctx)
        ret < 0 ? -ret : ret
      end
    end


Using seccomp in Python
    import errno
    from seccomp import *
    def setup_seccomp():
        f = SyscallFilter(ALLOW)
        # stop executions
        f.add_rule(ERRNO(errno.EPERM), "execve")
        f.add_rule(ERRNO(errno.EPERM), "execveat")
        f.add_rule(ERRNO(errno.EPERM), "vfork")
        f.add_rule(ERRNO(errno.EPERM), "fork")
        f.load()
        print('Seccomp enabled…')

Demo

Monitoring seccomp violations

Summary
- seccomp is a great, battle-tested mechanism
- Other operating systems have similar features under different names
- Easy to implement, also in high-level languages
- Packages exist for Python, Crystal, Go, Rust and Perl; none are up to date for Ruby and Node
- If there is no package, you can still create a profile using firejail, but…

Integrate seccomp natively in your app

Native integration
- No way of disabling
- Abort if installing the filter did not succeed
- Perfect if you do not control the environment

Do not roll your own security

Rethink your design…
- Validate inputs
- Do not implement your own security mechanisms!
- Do not call binaries in your apps
- Think about proper isolation

… by isolating
- Different processes
- Proper isolation (dropping privileges)
- No network connection
- Optional authentication
- Additional operational complexity

Thanks for listening
Q&A
Alexander Reelsen
Community Advocate
alex@elastic.co | @spinscale


Resources
- GitHub repo: seccomp-samples
- Tools: Auditbeat
- Blog post: Seccomp in the Elastic Stack
- Docs: Kernel seccomp documentation & seccomp manpage
- Auditd: Understanding audit log files
- Blog post: Elasticsearch - Securing a search engine while maintaining usability
- Talk: seccomp - your next layer of defense
- Libraries: libseccomp including Python integration, go-seccomp-bpf, seccomp.cr for Crystal

