Spoiler: The chipKIT Uno32 won. :)
Test Setup
To benchmark these boards, I used an excerpt from the Arduino Show Info program. The sketch includes several speed tests that examine GPIO manipulation speed and raw computational speed. While each of the 32-bit boards tested is very fast, optimization in the IDE also comes into play when measuring performance. Even an 84 MHz processor won't make up for poor code structure and inefficient register manipulation in the libraries. So instead of just comparing specs, let's put each board through tests that measure the completion time of commonly used program functions.
For the Arduino boards, I used Arduino IDE 1.6.7. For the chipKIT Uno32, I used MPIDE 0023.
Benchmarking Program Download: Shared on GitHub
Please examine this program to see exactly how each test is structured.
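For a feel of how a test like this can be structured, here is a minimal sketch of a timing loop. It is not the code from the benchmark program. The helper name `timePerCallMicros` is hypothetical, and the `micros()` function defined here is only a desktop stand-in for the Arduino function of the same name, so the example can be compiled off-board. The idea is to time many calls, subtract the cost of an empty loop, and divide by the iteration count.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Desktop stand-in for Arduino's micros() (assumption for illustration).
// On a real board, remove this function and use the one from the core.
static uint32_t micros() {
    using namespace std::chrono;
    static const steady_clock::time_point start = steady_clock::now();
    return static_cast<uint32_t>(
        duration_cast<microseconds>(steady_clock::now() - start).count());
}

// Hypothetical helper: average time of one call to op(), in microseconds.
// The cost of calling through the function pointer is part of the result.
double timePerCallMicros(void (*op)(), uint32_t iterations) {
    assert(iterations > 0);
    volatile uint32_t sink = 0;  // keeps the compiler from deleting the loops

    // Pass 1: empty loop, to estimate the loop overhead.
    uint32_t t0 = micros();
    for (uint32_t i = 0; i < iterations; ++i) {
        sink = i;
    }
    uint32_t overhead = micros() - t0;

    // Pass 2: same loop with the operation under test.
    t0 = micros();
    for (uint32_t i = 0; i < iterations; ++i) {
        sink = i;
        op();
    }
    uint32_t elapsed = micros() - t0;
    (void)sink;

    // Clamp at zero in case timer jitter makes the empty loop look slower.
    double net = (elapsed > overhead) ? double(elapsed - overhead) : 0.0;
    return net / double(iterations);
}
```

On a board, `op` could be a small function that calls `digitalWrite()` or does one float multiply. Using a large iteration count matters because `micros()` has a coarse resolution compared with operations that finish in a fraction of a microsecond.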
Results
Here are the benchmarking results. Each data point is the time to completion of the test in microseconds. Faster is better. The fastest board in each test is highlighted in green. I also highlighted two concerning results. Two of the tests could not be run on two of the boards due to compilation problems, as indicated.
The results of the speed tests.
Analysis
The Arduino Due and chipKIT Uno32 were the clear winners here. Despite the release of newer Arduino boards, the Due can still hold its own in raw computational speed. There is no doubt that it will be relevant for some time to come with its SAM3X8E microcontroller and large number of GPIO. The dark horse in the test, the chipKIT Uno32, did extremely well. In fact, I have to say it was the winner. It had very fast and consistent GPIO manipulation speed, awesome floating point performance, and it was nearly the equal of the Due in integer math. How much of this is due to the microcontroller versus IDE optimization? I cannot say. But hats off to the creators of that nice little board. It seems to be discontinued, but you can still find it for sale at about $15 cheaper than a Due!
The Arduino 101 did pretty well considering the lower clock speed. It is not a "Due killer" and was never intended to be. While the raw computational speed isn't as fast as I was hoping for, we may see improvements for this board as the IDE is optimized and we get access to the RTOS under the hood. Also, keep in mind that speed is only part of the story. Do any of the other boards have Bluetooth and an accelerometer on board? No. As a potential Arduino Uno successor, the Arduino 101 is a good addition to the lineup.
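One way to factor clock speed out of a comparison is to convert each measured time into an approximate number of CPU clock cycles. The helper below is a hypothetical illustration, not part of the benchmark sketch. The example figures are `digitalWrite` times that readers posted in the comments for a Teensy 3.6 at 180 MHz (0.192 us) and an ESP32 at 240 MHz (0.111 us).

```cpp
#include <cassert>

// Hypothetical helper: approximate clock cycles spent on one operation.
// cycles = time in microseconds * clock frequency in MHz
double cyclesPerOp(double timeMicros, double clockHz) {
    assert(clockHz > 0.0);
    return timeMicros * (clockHz / 1e6);
}

// Figures taken from the reader comments on this post:
//   cyclesPerOp(0.192, 180e6)  -> about 34.6 cycles (Teensy 3.6)
//   cyclesPerOp(0.111, 240e6)  -> about 26.6 cycles (ESP32)
```

A lower cycle count suggests leaner library code for that call, whatever the clock speed. Treat the result as a rough guide only, since flash wait states and peripheral bus speeds also affect these timings.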
Speaking of an Arduino Uno successor, the Arduino Zero has been billed as just such a board. I don't have an official Zero, but I did test SparkFun's take on it, their SAMD21 Dev Board. Sadly, I was a bit disappointed. Sure, the SAMD21 chip turned in some good numbers, but I feel that the Arduino Zero is too expensive for what it offers. It is slightly more expensive than a Due, with lower performance and fewer GPIO pins. With the Arduino 101 out at $25 less in price, where does the Zero fit in? Less expensive Zero-compatible boards might be good alternatives if you want to explore the Cortex-M0+ SAMD21 microcontroller.
The Arduino Uno rocked it! Well, ok, it looks pretty slow compared to the other boards. However, remember that it is extremely unfair to pit an 8-bit microcontroller against modern 32-bit devices in speed tests. The Uno still has enough processing power for the majority of hobbyist projects. It is a classic board that will be around for years to come.
There were two concerning results that I highlighted on the table. Analog read performance on the Arduino Zero clone is abysmal! There has to be something wrong in the IDE there. Nearly half a millisecond to read an analog pin is unacceptable. Hopefully it will be fixed in the near future. Also, dtostrf() speed on the Arduino 101 was far too slow. Once again, this has to be due to optimization problems in the IDE. That is not a commonly used function, so I doubt many people will notice.
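To put a per-call time in perspective, it helps to turn it into an upper bound on calls per second. The helper below is a hypothetical illustration. The 500 us input is simply "half a millisecond" rounded from the text above, not an exact table value, and the 6.599 us input is the `analogRead()` time a reader reported in the comments for a Teensy 3.6.

```cpp
#include <cassert>

// Hypothetical helper: upper bound on back-to-back calls per second,
// given the time one call takes in microseconds.
double maxCallsPerSecond(double timeMicros) {
    assert(timeMicros > 0.0);
    return 1e6 / timeMicros;
}

// maxCallsPerSecond(500.0)  -> 2000 reads per second at most
// maxCallsPerSecond(6.599)  -> roughly 151,500 reads per second
```

At around 2,000 reads per second, a sketch that samples several analog channels in a loop would spend most of its time waiting on the ADC, which is why that result stands out.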
Conclusion
That's it! I hope you found this shootout useful. Download the sketch above and benchmark your own boards and microcontrollers. It's a lot of fun.
If you would like to see a similar shootout with small AVR, PIC and ARM microcontrollers, such as the ATtiny and STM32F030 devices, please post in the comments below. Also post any comments and corrections.
Thanks for reading!
- Dan W.
Nice work! By the way, do you know the floating-point precision of the Arduino 101?
Thanks for your benchmark! I just ran it on the Teensy 3.6 at the default 180 MHz.
Speed test
----------
F_CPU = 180000000 Hz
1/F_CPU = 0.0056 us
nop : 0.006 us
digitalRead : 0.084 us
digitalWrite : 0.192 us
pinMode : 0.192 us
multiply byte : 0.039 us
divide byte : 0.047 us
add byte : 0.039 us
multiply integer : 0.033 us
divide integer : 0.039 us
add integer : 0.033 us
multiply long : 0.032 us
divide long : 0.049 us
add long : 0.033 us
multiply float : 0.044 us
divide float : 0.124 us
add float : 0.044 us
itoa() : 0.279 us
ltoa() : 1.099 us
dtostrf() : 15.474 us
random() : 0.399 us
y |= (1<<x) : 0.027 us
bitSet() : 0.028 us
analogReference() : 0.122 us
analogRead() : 6.599 us
analogWrite() PWM : 0.534 us
delay(1) : 1000.499 us
delay(100) : 100000.000 us
delayMicroseconds(2) : 2.001 us
delayMicroseconds(5) : 5.004 us
delayMicroseconds(100) : 100.049 us
-----------
ESP32 at 160 MHz with the analog code removed (no analog read/write), as that part of the SDK is not finished yet.
Speed test
----------
F_CPU = 160000000 Hz
1/F_CPU = 0.0062 us
nop : 0.006 us
digitalRead : 0.216 us
digitalWrite : 0.168 us
pinMode : 0.526 us
multiply byte : 0.056 us
divide byte : 0.056 us
add byte : 0.050 us
multiply integer : 0.080 us
divide integer : 0.083 us
add integer : 0.080 us
multiply long : 0.078 us
divide long : 0.073 us
add long : 0.080 us
multiply float : 0.078 us
divide float : 1.398 us
add float : 0.078 us
itoa() : 1.083 us
ltoa() : 1.098 us
dtostrf() : 17.198 us
random() : 0.673 us
y |= (1<<x) : 0.067 us
bitSet() : 0.067 us
delay(1) : 999.998 us
delay(100) : 100000.000 us
delayMicroseconds(2) : 2.010 us
delayMicroseconds(5) :
ESP8266 at 160 MHz - analogReference commented out - WDT triggers on the longer delay check
Speed test
----------
F_CPU = 160000000 Hz
1/F_CPU = 0.0062 us
nop : 0.006 us
digitalRead : 0.299 us
digitalWrite : 0.216 us
pinMode : 0.781 us
multiply byte : 0.050 us
divide byte : 0.201 us
add byte : 0.050 us
multiply integer : 0.074 us
divide integer : 0.229 us
add integer : 0.067 us
multiply long : 0.074 us
divide long : 0.224 us
add long : 0.068 us
multiply float : 0.369 us
divide float : 1.874 us
add float : 0.344 us
itoa() : 0.634 us
ltoa() : 4.599 us
dtostrf() : 22.674 us
random() : 1.274 us
y |= (1<<x) : 0.055 us
bitSet() : 0.056 us
analogRead() : 0.399 us
analogWrite() PWM : 5.314 us
delay(1) : 1009.999 us
delay(100) : 100025.000 us
delayMicroseconds(2) :
Soft WDT reset
Hi Dan, thanks for your great speed test! I was wondering if you still have a copy of the spreadsheet you used to create your comparison chart? Could you maybe upload it to Google docs and share? Maybe send it to me and I could do that? Thanks!
ESP32 @ 240 MHz with the 11-29-2016 Arduino version
Speed test
----------
F_CPU = 240000000 Hz
1/F_CPU = 0.0042 us
nop : 0.004 us
digitalRead : 0.154 us
digitalWrite : 0.111 us
pinMode : 2.559 us
multiply byte : 0.037 us
divide byte : 0.036 us
add byte : 0.033 us
multiply integer : 0.053 us
divide integer : 0.054 us
add integer : 0.053 us
multiply long : 0.051 us
divide long : 0.049 us
add long : 0.053 us
multiply float : 0.051 us
divide float : 0.924 us
add float : 0.054 us
itoa() : 0.719 us
ltoa() : 0.699 us
dtostrf() : 11.449 us
random() : 0.449 us
y |= (1<<x) : 0.045 us
bitSet() : 0.045 us
delay(1) : 999.999 us
delay(100) : 100000.000 us
delayMicroseconds(2) : 2.006 us
delayMicroseconds(5) : 4.999 us
delayMicroseconds(100) : 99.999 us
-----------