TL;DR
I am a Python guy. I write Python at work, I write Python for fun, and I've even dabbled with writing Python outside in the fresh air. Someday I hope I can plug a keyboard into a Kindle and actually code outside comfortably.

I've also been reading a textbook called Compiler Design in C lately. I've just gotten to the part where the author describes a relatively complex way of reading files with a minimum of copying. (Coming from a background that rarely cares about performance being better than "good enough," it's different to be reading about designing for high performance in the first place.)

WHY??

In the text the author claims that "MS-DOS read and write times improve dramatically when you read 32k bytes at a time." I had to test this, and I figured I could pit C vs. Python in a very shallow, distorted way at the same time.

The Setup

I originally ran this test reading the same small file chunk over and over again, but I realized that this probably takes advantage of OS caching and becomes a test of that caching rather than of the speed of the two languages.

So I set up an 8GB file, filled with the string "0123456789ABCDEF" over and over and over. Then, for each buffer size, each language does 2,000 sequential reads of the file.

Pitfalls

Sequential and random reads are known to produce different performance characteristics. It would probably have produced better results if I had done a series of random reads instead of sequential ones. Also, 2,000 iterations is not really enough to establish behavior solidly, but I didn't think of doing random reads until just now, and there was no way I was going to set up a 40GB file so that I could do 10,000 reads of 4MB each.

I didn't do a whole lot of research into the buffering modes that Python offers for file reads. Some of those would make a difference. I have a feeling that normal Python file reads are internally buffered and copied at least once.
That's a huge advantage for C, because read() is purported to allow the OS to copy straight from disk into your buffer if the buffer is the right size. At least, it was allowed in 1990 when this book came out.

The Results

So vanilla Python reads are half as fast as C's read(). Big whoop. I was expecting much worse, perhaps 5-7x slower. At least on Windows 7, these limited benchmarks indicate an optimal C buffer size somewhere between and including 32K and 1M. I'm convinced that the high read speeds below 32K for C would disappear entirely if I were doing random reads.

For Python, I'm not sure what to recommend. The highest speeds were with 4K, but that just seems too low to make sense. More research required.

The Stuff

The Excel Spreadsheet
The Code
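In case the code link doesn't survive, here's a minimal sketch of the kind of benchmark described above, on the Python side. The file path, buffer sizes, and iteration counts are assumptions, not the originals. The second function approximates C's read() by going through os.read on a raw file descriptor, bypassing Python's internal buffering layer (whether that actually avoids an extra copy is OS-dependent):

```python
import os
import time

def timed_reads(path, buf_size, n_reads=2000):
    """Time n_reads sequential reads of buf_size bytes through
    Python's normal (internally buffered) file object."""
    with open(path, "rb") as f:
        start = time.perf_counter()
        for _ in range(n_reads):
            if not f.read(buf_size):
                break  # hit EOF early
        return time.perf_counter() - start

def timed_raw_reads(path, buf_size, n_reads=2000):
    """Same, but through os.read on a raw file descriptor -- about
    as close as Python gets to C's read(), with no Python-level
    buffer in between."""
    # O_BINARY only exists on Windows; it's a no-op elsewhere.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        start = time.perf_counter()
        for _ in range(n_reads):
            if not os.read(fd, buf_size):
                break  # hit EOF early
        return time.perf_counter() - start
    finally:
        os.close(fd)
```

To reproduce something like the experiment, you'd call each function on a large file at each buffer size (e.g. 4K, 32K, 1M) and compare the elapsed times.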
See, when I look at those curly braces, I just think "Why? You've got the whitespace in there already!" [:P]
But, seriously, here's some advice which I think you can take about 75% to the bank: C is and always will be the dominant language. ;)
Curly braces give it structure, and make it feel less flimsy.
I personally think that quote is about 75% BS. Let me interrupt myself and point out that I'm not trying to be a dick to you; I just despise Python (the fact that I don't smile in my avatar makes my posts look serious and angry). =P

But yeah, there are a lot of reasons why one programming language is a better choice than another for certain situations. That guy is spoutin' hippie crap about all languages being equal! How dare he! *shakes fist*

What I choose to think he means about programming languages is more like this:
Measure a programming language not by the people who use it, but by its efficacy at doing the job you need done.

Your post needs more flagrant one-liners to be dickish, IMO. :)