Thursday, October 06, 2005

Security in a language?

Lately I've taken more notice of the debates over programming languages. People often claim that Java is inherently more secure than C; that C is faster than Java; that C++ is easier than C; that C++ is slow and has a bloated syntax that makes it confusing; or any number of other things about languages. Looking at C and Java, I'd like to make a quick point.

In GCC 4.1, a re-implementation of ProPolice is included to help squelch stack smashes. OpenBSD has a new secure heap manager that does a similar job in the heap. Then there's PaX, with strict but lightweight memory protections, as well as GrSecurity, a project that aims to be a complete security solution built on top of PaX.

These are just a few basic enhancements, but they bring a lot with them. On top of a typical system, ProPolice and the secure heap manager both not only stop security attacks, but report enough specific debugging information to almost trivialize finding and fixing the bugs. PaX stops remote code injection and ret2libc cold, knocking off the basic building blocks of these attacks. GrSecurity finishes up with a few interesting restrictions, including some extreme information separation in /proc and an enhancement to prevent /tmp races, as well as a full mandatory access control system like the more familiar SELinux.

The results are nice, to say the least. C and C++ programs are immunized against stack smashes and heap overflows, as well as code injection and out-of-order execution of existing code (ret2libc-style attacks) in general. By frequency, this alone eliminates over half of the security bugs seen in these languages, according to some analysis of the first 60 security announcements from Ubuntu Linux. This includes stack and heap buffer overflows, integer overflows, and most other memory corruptions. And the cost of all this? An increase in CPU load of a percent or two, no more.

What of Java or Mono? Well, to start with, these programs typically run on top of a JIT or JVM. This demands that the strict data-code separation in PaX be disabled, resulting in a slightly weakened security model. On top of that, Java platforms assume that Java arrays can't be overflowed or double-freed, because the language is bounds checked and garbage collected. However, the platform adds many tens or hundreds of thousands of lines of code mimicking the functions of the operating system, and there is always the slight possibility that a mistake somewhere in that code allows Java byte code to force a double-free or internal overflow, leading to code injection.

C also carries with it a platform, though a very thin one: the "C runtime" or "C standard library." Although small, it presents the same issues as the Java platform; it's just fractionally as worrisome because of its smaller size, and its inherent security issues are better understood at this point. C++, Objective-C, and other languages build on top of the C runtime and create the problem of an expanded runtime again, though not to the epic proportions of full platforms like Java or Mono. In addition, these runtimes can still be protected by the same enhancements that protect C, although some may have other unexpected attack vectors.

There's a lack of convenience in full platform systems that hasn't been discussed, and has little to do with security. Java and Mono both isolate the program from the underlying system; because of this, they need bindings or reimplementations of common libraries. For example, for Java applications to use Ogg Vorbis, there must be a class that either supplies a Java implementation of Ogg Vorbis or binds to the native libvorbisfile.so or libvorbisfile.dll. This is mildly aggravating, but it also grows the unprotected code base and forces C libraries used via bindings to run without some security enhancements in JIT and JVM implementations.

There's one more effect that the security enhancements have on C and C++ programs. The increased restrictions tend to expose bugs rather spectacularly; once in a while a program will go from "acting weird sometimes" to simply hard-crashing when introduced into a secure environment. Not only does this force programmers to fix these bugs, but with the stack and heap protections, it even helps them along the way. In essence, these enhancements can even reduce the difficulty of debugging a C program, though not necessarily below the cost of debugging or writing a Java or C# program, unless the needed libraries aren't available on those platforms.

So now, which language is more secure? I still say C, for no other reason than that it's easier to make the system protect itself from broken C programs than from broken JVMs or Java applets. Assuming the JVM itself is perfect, however, I'm willing to say that C and Java are about on equal ground.

11 Comments:

Blogger lazythinker said...

Your analysis here seems skewed towards what you happen to be working on and away from any real-world data.

The risks in the JVM are also much smaller than you are assuming simply because the vast majority of the JVM's base code is implemented in Java. There's a small core which handles the array bounds checking, memory management, etc, but the rest of the rather large Java core libraries are either completely in Java (and hence are within the same safe "sandbox" as your own code), or a few small bits are in C but leverage the same memory management code. That's why you *don't* see the history of Java exploits, insecure JVMs, etc, that you are hinting at. The Java libraries don't have to be "perfect" -- of course, they have bugs -- but those bugs result in broken functionality, not buffer overruns.

Then there's the question of how many programmers out there are actively following the steps you are to secure their C or C++ code... because if they *aren't*, it's pretty obvious that if you have a C programmer who does nothing "extra" and a Java programmer who does nothing extra, the Java webapp will be much more secure.

(BTW... hello from /.)

11:08 AM  
Blogger Mike Täht said...

I largely agree with everything you write above and disagree with everything in the comment above.

"The risks in the JVM are also much smaller than you are assuming simply because the vast majority of the JVM's base code is implemented in Java."

For anyone who has tried to make java do anything useful in domains other than webserving, a vast amount of a complex java application's base code is glue to external C and C++ libraries - see "eclipse" for one particularly large example. That glue is a large potential source of bugs, security holes and other problems.

11:30 AM  
Blogger rjf said...

@mike

It's amazing to me that after all these years, there is still this perception that large, complex systems cannot be written in pure Java, and that it's only good for "webserving".

I have been involved in writing such systems in both C/C++ and Java. Each has some advantages, but Java is a clear winner in reducing development time, bug fixing cycles and number of serious faults in the resulting system.

12:04 PM  
Blogger Metal said...

This reads like a troll to me.
Still, let's use a quick example:
You're trying to argue that a program like

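#include <stdio.h>
#include <stdlib.h>

int main() {
    /* deliberately unsafe: the query string is used as the format string */
    printf(getenv("QUERY_STRING"));
    return 0;
}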

is just as safe as

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.*;

public class SomeServlet extends HttpServlet {
    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
            throws ServletException, IOException {
        response.getWriter().println(request.getQueryString());
    }
}

Apparently your argument is that if the C program is built on just the right version of the right compiler with the right extensions, it won't be vulnerable to a glaring remotely exploitable security bug.
Hence C is at least as secure as Java.

You must be kidding.

Security as an afterthought can never hope to match a good initially secure design.

(On the other hand, both the Java and C programs above are vulnerable to an XSS attack, which goes to show Java programs are not immune from security issues caused by poor design either.)
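To spell out the bug in the C version: the query string is handed to printf as the format string, so conversions such as %x or %n supplied by a remote user get interpreted. A minimal sketch of the usual fix follows. The helper name format_untrusted is made up for illustration and is not from any library; the point is only that untrusted text is passed as data behind a fixed "%s" format.

```c
#include <stdio.h>

/* Hypothetical helper (illustration only): copy untrusted text into dst
   as plain data. The format string is the fixed literal "%s", so any
   '%' characters inside `untrusted` are never interpreted.
   Returns 0 on success, -1 on bad arguments or truncation. */
int format_untrusted(char *dst, size_t cap, const char *untrusted)
{
    int n;

    if (dst == NULL || cap == 0) {
        return -1;
    }
    if (untrusted == NULL) {
        untrusted = "";         /* e.g. getenv() found no QUERY_STRING */
    }
    n = snprintf(dst, cap, "%s", untrusted);
    if (n < 0 || (size_t)n >= cap) {
        return -1;              /* encoding error or output truncated */
    }
    return 0;
}
```

Calling printf("%s", ...) or fputs directly follows the same principle. This only addresses the format string; the XSS issue mentioned above is untouched.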

7:52 PM  
Blogger KMag said...

One of the ideas behind high-level virtual machines such as the JVM is to take tasks that programmers have historically done very poorly, and put this code in one place where it can be carefully audited before being heavily reused by all of the code that runs on top of the VM.

It's about the number of lines of code that must be audited in order to get a reasonable assurance that certain types of bugs will not occur. (In some domains with certain types of bugs, "reasonable assurance" means "documented formal proof".)

Once you've audited the few lines of code in the bytecode interpreter that check array bounds and the few lines of code in the JIT that insert array bounds checks, you can be sure that reads or writes out of bounds will not succeed. Auditing a large C program for correct bounds checking means auditing each and every place a pointer or array is used.
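To make the contrast concrete, here is a minimal C sketch of the kind of check a JVM performs implicitly on every array access. The names checked_store and checked_load are invented for this example; in C, every call site has to remember to use something like them, and an auditor has to confirm that each one does.

```c
#include <stddef.h>

/* Hypothetical helper: store `value` at arr[idx] only if idx is in range.
   Returns 0 on success, -1 if the write would fall outside the array. */
int checked_store(int *arr, size_t len, size_t idx, int value)
{
    if (arr == NULL || idx >= len) {
        return -1;              /* reject instead of corrupting memory */
    }
    arr[idx] = value;
    return 0;
}

/* Same idea for reads; the element is delivered through *out.
   *out is left unchanged when the access is rejected. */
int checked_load(const int *arr, size_t len, size_t idx, int *out)
{
    if (arr == NULL || out == NULL || idx >= len) {
        return -1;
    }
    *out = arr[idx];
    return 0;
}
```

Because idx is unsigned, a negative index converted to size_t becomes a huge value and is rejected by the same comparison.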

The same goes for double-frees and type safety. A relatively small number of lines of code in the VM must be audited in order to ensure correctness (security is a subset of correctness) instead of needing to audit every line of C code ever written for double-frees and incorrect downcasts.

Advocating manual bounds checking and memory management amounts to advocating against code reuse in tasks that have historically proven to be difficult for most programmers to get correct.

You could argue that programmers should only use open-source (or at least source-available) VMs, and audit the security-critical portions of the code themselves. However, it's hard to argue that programmers should regularly manually re-implement commonly buggy operations.

Large runtimes mean large bodies of re-usable code. If you haven't audited portions of the runtime or VM, then don't use those portions in critical systems. Demand the source code and audit the code yourself rather than throwing the baby out with the bath water.

That being said, improper code reuse has caused many problems, mostly through poorly documented or ignored pre-conditions, post-conditions, and side-effects. However, when done carefully, code reuse is a large net positive for software reliability. High-level virtual machines such as the JVM encourage (and sometimes force) code reuse in areas that have proven to be historically bug-ridden.

4:00 AM  
Blogger KMag said...

Sorry if this shows up as a duplicate... my submission over 6 hours ago hasn't shown up yet. (Yes, I've flushed my browser cache and re-loaded.)

The issue is the number of lines of code (assuming roughly equivalent complexity of lines of code) that must be audited in order to be reasonably sure that your code is free from a certain type of bug. (In some domains, "reasonably sure" means "formally proven with full documentation" for certain types of bugs.)

With Java, some of the most complicated and historically bug-ridden code is encapsulated in the JVM and re-used by every single program, rather than spread throughout the code, doomed to be reimplemented by every software intern. It's about isolating the parts that people historically are bad at programming, auditing them, and heavily re-using the highly audited implementations.

Advocating auditing your own memory management, bounds checking, etc. in C is equivalent to advocating ad-hoc reimplementations over code re-use. Sure, there are tools to help you audit your code, but you're still auditing countless implementations of code that would be implemented just once or twice in a JVM.

If you don't trust your virtual machine and/or runtime, then the solution is to use a FOSS (or at least source-available) VM/runtime and audit the security critical portions yourself. On the other hand, very few people would fault you for using the JVM and Java runtime from the EAL-4 version of Solaris (if it contains a JVM and Java runtime... I'm not sure on that part) and just assuming that a Department of Defense EAL-4 audit is more rigorous than the audit you would perform yourself.

C and C++ use manual array bounds checking. Every line of code that uses arrays or pointers must be audited in order to assure reads and writes are in bounds. Once you've audited the (very few) lines of code in the bytecode interpreter and the JIT responsible for inserting bounds checks, you can be sure that out-of-bounds reads and writes will fail in all of your Java code.

C and C++ are weakly statically typed languages. In other words, you can completely bend the type systems by upcasting to (void *) and then performing unchecked downcasts. After you've audited the class loader, you can be sure your Java code is free from such manglings of the type system. The same sort of assurances in C/C++ require auditing all of your code. The author of TFA says that Java loses the strict data-code separation of PaX. However, the JVM's strict enforcement of type safety already creates strict data-code separation.
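As an illustration of what a checked downcast means, here is a small C sketch in which a hand-maintained tag plays the role of the runtime type information a JVM consults on every cast. The type and function names (struct tagged, as_int, as_text) are invented for this example; nothing in C forces callers to go through them, which is the auditing problem described above.

```c
#include <stddef.h>

/* Hypothetical tagged value: `kind` records which union member is live. */
enum kind { KIND_INT, KIND_TEXT };

struct tagged {
    enum kind kind;
    union {
        int i;
        const char *text;
    } as;
};

/* Checked "downcast": yields a pointer to the int only if the tag agrees,
   NULL otherwise. */
const int *as_int(const struct tagged *t)
{
    if (t == NULL || t->kind != KIND_INT) {
        return NULL;
    }
    return &t->as.i;
}

/* Checked "downcast" to text: NULL if the value is not tagged as text. */
const char *as_text(const struct tagged *t)
{
    if (t == NULL || t->kind != KIND_TEXT) {
        return NULL;
    }
    return t->as.text;
}
```

A mismatched request comes back as NULL instead of silently reinterpreting memory, roughly the C analogue of Java throwing ClassCastException.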

C and C++ use manual allocation and freeing of memory. In very complicated cases, engineers may have no choice but to use third-party automatic memory managers, or implement their own reference counting or other memory management systems. Proving that code does not perform double-frees can be very complicated. On the other hand, with Java code, only an audit of the JVM's garbage collector is required. (Depending on the exact type of garbage collector used, some other code may need to be audited to ensure that some of the representational invariants of the internal object representation are not violated.)
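A common C idiom for avoiding the simplest double-frees is to null the pointer as part of releasing it. The sketch below uses a made-up helper name, release, and it only protects the one pointer variable passed in. Any other alias to the same block is still dangling, which is part of why proving the absence of double-frees in a large program is hard.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block *slot points to and set *slot to
   NULL, so calling it again on the same variable is a harmless no-op.
   Returns 1 if memory was released, 0 if there was nothing to free. */
int release(void **slot)
{
    if (slot == NULL || *slot == NULL) {
        return 0;
    }
    free(*slot);
    *slot = NULL;               /* a second release() now does nothing */
    return 1;
}
```

With a garbage collector there is no equivalent call for application code to get wrong in the first place.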

10:37 AM  
Blogger Mike Täht said...

@rjf: I agree that large, complex programs can be written in Java, outside of the web domain. My point was, however, that bugs accumulate at the interfaces between programmers, and until the entire world of useful libraries is rewritten in Java (which, I hope you'll grant, is an unlikely event), the glue logic between Java and those libraries will remain a huge potential source of bugs and security problems. Given that a lot of this glue is unauditable (due to the Java source code being unpublished), and the library interfaces are always evolving, it's a nightmare.

That's my core problem with Java on everything.

@Metal:

As for the code reduction involved in programming Java, your example showed that you can write a buggy program in C in a lot fewer lines than a buggy program in Java! :)

It's possible to write bad code in java - one of the failing java projects I had to fix once upon a time had 84% of the "try ... catch" blocks doing nothing when they caught an exception.

Just because safe structures are in the language does not mean they will be effectively used.

Just because you "know java" does not mean you will also program better, or more efficiently.

Another anecdotal story: I once met a young programmer who was trying to get her Java code to copy files around to various machines. She had architected a complete, several-hundred-line solution to the generic problem of copying files around... but ran into the problems of authentication and file permissions on the systems involved, which aren't integrated into Java... and wanted her application to run as root in order to bypass them!

I replaced her code with 'system("scp filea user@machinea:fileb")'

When I cope with Java's limitations I'm always reminded of Pascal, whose writeln construct was hopelessly inadequate to the problem space. You get 90% of the solution coded in Java and then fall off a cliff because the needed functionality isn't there for some critical feature.

Java continues to improve - I was pleased to see that non-blocking I/O was added to the language recently.

Perhaps, after gcj matures a while longer, I'd be willing to do another project in it.

3:07 PM  
