Mira Scripting Benchmarks

How fast should your scripts run? The Lua literature and user groups often mention Lua's remarkable speed. The Script library is designed so that most numerically intensive processing uses Mira's core library of highly optimized array functions. However, there are times when you need to do intensive numeric processing inside the script itself, such as when working with large tables of data or processing millions of values with your own script code. So that you aren't left in the dark about what "fast" means, here are some benchmarks we measured at Mirametrics. The table below suggests what to expect from your numerically intensive scripts. Many benchmarks list the script source code used. The test procedure and test machine are described at the bottom.

NOTE: Some of these capabilities are available only in MX Script.

Benchmark Time (sec) Speed
Create a 64-bit real image from an array of 1 million elements using I:Set(t). 0.0791 12.7 million elements / sec
Create a 64-bit real image from an array of 250,000 elements using I:Set(t). 0.0181 13.8 million elements / sec
Create a 64-bit real image from an array of 10,000 elements using I:Set(t). 0.000766 13.1 million elements / sec
Set 1 million elements in a Lua table using
t={}; for k=1,1000000 do t[k]=k end
0.220 4.54 million elements / sec
Set 10 million table elements in a global table using
t={}; for k=1,10000000 do t[k]=k end
2.53 3.95 million elements / sec
Create and set 1 million elements in a local table using
local t={}; for k=1,1000000 do t[k]=k end
0.096 10.4 million elements / sec
Perform 10 million multiplies using local values:
local n=0; local m=0; for i=1,10000000 do k=n*m end
0.314 31.9 million multiplies / sec
Perform 10 million divides using local values:
local n=0; local m=0; for i=1,10000000 do k=n/m end
0.323 31.0 million divides / sec
Perform 10 million adds using local values:
local m=0; local n=0; for i=1,10000000 do k=n+m end
0.316 31.6 million adds / sec
Perform 10 million additions using global values:
n=0; m=0; for i=1,10000000 do k=n+1000 end
0.898 11.1 million adds / sec
Perform 10 million additions using global values:
k=0; for i=1,10000000 do k=k+1 end
0.809 12.4 million adds / sec
Perform 10 million empty loops:
k=0; for k=1,10000000 do end
0.177 56.5 million loops / sec
Perform 10 million divides and save in a local array:
local t={}; local m=3; for i=1,10000000 do t[i]=i/m end
1.46 6.85 million divides / sec
Least squares solution of 100 points with 4 parameters and 3 variables using a "hyperplane" basis function declared in the script 0.042 24 fits / sec
Least squares solution of 100 points with 4 parameters and 3 variables using internal "hyperplane" basis function 0.000556 1,800 fits / sec
Least squares solution of 10 points with 4 parameters and 3 variables using internal "hyperplane" basis function 0.000055 18,000 fits / sec
Least squares solution of 1000 points using a 3x2 (6 term) 2-D polynomial. 0.0008 1,250 fits / sec
Least squares solution using CLsqFit class to fit 10 points with a 3x2 (6 parameter) 2-D polynomial. 0.000041 24,400 fits / sec
Least squares solution using a 6 term polynomial to fit 1000 points. This example uses the simple global function TFit, although greater versatility is available using the CLsqFit class. The data are contained in an array t. This function returns up to 4 results: an array of coefficients, array of errors, the fit standard deviation, and the sample mean:
t = {} c,e,s,m = TFit(t,6)
0.0011 900 fits / sec
Create 1 million uniformly distributed random numbers.
t = TRand(1000000)
0.243 4.1 million numbers / sec
Create 1 million Gaussian distributed random numbers.
t = TGaussDev(1000000)
1.16 862,000 numbers / sec
Histogram of 1 million real numbers using 100 bins. Adopting other than default parameters requires using class methods, as shown here:
H = NewHist() H:SetBins(100) H:Calc()
0.167 6 million numbers / sec
Histogram of 1 million real numbers, pre-sorted. This uses the global THist function; only the THist call is benchmarked:
t = TRand(1000000) TSort(t) THist(t)
0.093 11 million numbers / sec
Add two 1000x1000 64-bit real images 0.00425 236 images / sec
Add two 1000x1000 32-bit real images 0.00198 500 images / sec
Add two 1000x1000 16-bit integer images 0.00142 704 images / sec
Add two 1000x1000 24-bit RGB images 0.00414 242 images / sec
Add two 1200x960 48-bit URGB images 0.00444 225 images / sec
Multiply two 1000x1000 32-bit real images 0.0033 300 images / sec
Multiply 1000x1000 32-bit real image by a number 0.00475 210 images / sec
Divide two 1000x1000 32-bit real images 0.0094 106 images / sec
Start with image I[1] which is a 1000x1000 pixel 16-bit image. Convert it to "float" data type and then do various arithmetic operations on it:

I[1]:SetDatatype("float")
I[2] = I[1] + 1000
I[3] = I[1] / I[2]
I[4] = I[1] ^ I[3]

All 4 images use the 32-bit real pixel type. This process involves creating four 4 MB images as well as performing the mathematical operations between them. The last operation raises I[1] to the power of I[3], pixel by pixel (a very CPU-intensive computation). The benchmark includes all 4 steps.
0.097 10.3 million pixels / sec
Same as above plus display all 4 final images in a new image window. This includes computation of an image histogram, transfer function, and palette mapping for each image. 0.369 2.7 images / sec
Load a 1-megapixel image of 16-bit pixels from the hard drive, compute the complete image histogram and auto-scale transfer function using gamma=0.6, then display in a new window. 0.125 8 images / sec

Conclusions

If there are any major results to be gleaned from the table above, they are as follows:

  • Mira's use of Lua provides a high-performance scripting language.
  • Declare local values whenever possible. This has 2 advantages: it is faster when a value is used many times in a loop, and it prevents the global namespace from being polluted by different values with conflicting names.
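
The second conclusion can be illustrated with a short plain-Lua sketch. It is only an illustration: the function names sum_global and sum_local are hypothetical and not part of Mira's script library, and no timings are implied beyond those in the table above. In Lua, a global is looked up by name in a table on every access, while a local is held in a slot belonging to the function, which is why the local form is both faster in a tight loop and leaves nothing behind in the global namespace.

```lua
-- Plain-Lua sketch contrasting a global and a local accumulator.
-- sum_global and sum_local are illustrative names, not Mira functions.

-- Global version: `total` is stored in the global table, so each read and
-- write inside the loop is a lookup by name.
local function sum_global(count)
  total = 0
  for i = 1, count do
    total = total + i
  end
  return total
end

-- Local version: `total` exists only inside the function and vanishes
-- when the function returns.
local function sum_local(count)
  local total = 0
  for i = 1, count do
    total = total + i
  end
  return total
end

print(sum_global(1000), total)  -- the global `total` is still set afterwards
total = nil                     -- remove the leaked global
print(sum_local(1000), total)   -- same sum, and `total` is nil: nothing leaked
```

Both functions return the same sum. The difference is that the first one leaves a global named total behind, which any other script using that name could overwrite or misread.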

Testing Procedure and the Test Machine

The test machine was chosen to be representative of a typical "fast" machine in use by Mira users. This machine uses a 3.0 GHz Pentium Core-2 Duo E-6850 CPU with a 1333 MHz front-side bus and 4 GB of 800 MHz DDR-2 RAM. The operating system was Windows XP/SP3. Applications open during these tests included Visual Studio 2008, Calculator, Outlook, Windows Explorer, and Mira MX Ultimate Edition. To increase the significance of the benchmarks, most procedures were repeated in a loop of 10 to 10,000 cycles and the measured time was divided by the number of cycles. Each timing was then repeated 3 to 10 times, and the typical value, rather than the lowest value, was adopted as the benchmark.
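
The procedure described above can be sketched in plain Lua. This is not the script used for the published numbers. The helper names time_once and benchmark are hypothetical, os.clock (which reports CPU time) stands in for whatever timer was actually used, and taking the median as the "typical" value is an assumption.

```lua
-- Sketch of the timing procedure in plain Lua.
-- time_once, benchmark, and the cycle/repeat counts are illustrative
-- assumptions, not part of Mira's script library.

-- Run `proc` `cycles` times and return the average CPU time per call (sec).
local function time_once(proc, cycles)
  local start = os.clock()
  for _ = 1, cycles do
    proc()
  end
  return (os.clock() - start) / cycles
end

-- Repeat the timing `repeats` times and return the median, used here as
-- the "typical" value rather than the lowest one.
local function benchmark(proc, cycles, repeats)
  local times = {}
  for r = 1, repeats do
    times[r] = time_once(proc, cycles)
  end
  table.sort(times)
  return times[math.floor((repeats + 1) / 2)]
end

-- Example workload: 1 million additions on local values.
local function add_loop()
  local n, m, k = 0, 1, 0
  for _ = 1, 1000000 do
    k = n + m
  end
  return k
end

local seconds = benchmark(add_loop, 10, 5)
print(string.format("%.6f sec per run", seconds))
```

Reporting a typical value instead of the minimum gives a figure closer to what a user would see on a machine with other applications running, at the cost of some run-to-run noise.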

