-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathindex.html
More file actions
868 lines (860 loc) · 136 KB
/
index.html
File metadata and controls
868 lines (860 loc) · 136 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title></title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
div.sourceCode { overflow-x: auto; }
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; } /* Keyword */
code > span.dt { color: #902000; } /* DataType */
code > span.dv { color: #40a070; } /* DecVal */
code > span.bn { color: #40a070; } /* BaseN */
code > span.fl { color: #40a070; } /* Float */
code > span.ch { color: #4070a0; } /* Char */
code > span.st { color: #4070a0; } /* String */
code > span.co { color: #60a0b0; font-style: italic; } /* Comment */
code > span.ot { color: #007020; } /* Other */
code > span.al { color: #ff0000; font-weight: bold; } /* Alert */
code > span.fu { color: #06287e; } /* Function */
code > span.er { color: #ff0000; font-weight: bold; } /* Error */
code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
code > span.cn { color: #880000; } /* Constant */
code > span.sc { color: #4070a0; } /* SpecialChar */
code > span.vs { color: #4070a0; } /* VerbatimString */
code > span.ss { color: #bb6688; } /* SpecialString */
code > span.im { } /* Import */
code > span.va { color: #19177c; } /* Variable */
code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code > span.op { color: #666666; } /* Operator */
code > span.bu { } /* BuiltIn */
code > span.ex { } /* Extension */
code > span.pp { color: #bc7a00; } /* Preprocessor */
code > span.at { color: #7d9029; } /* Attribute */
code > span.do { color: #ba2121; font-style: italic; } /* Documentation */
code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
</style>
<link rel="stylesheet" href="github-pandoc.css" type="text/css" />
<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
</head>
<body>
<div id="TOC">
<ul>
<li><a href="#acknowledgements">Acknowledgements</a></li>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#quick-start">Quick Start</a></li>
<li><a href="#ch:installation">Installation</a><ul>
<li><a href="#requirements-and-dependencies">Requirements and dependencies</a></li>
<li><a href="#obtaining-opencal">Obtaining OpenCAL</a></li>
<li><a href="#structure-of-the-distribution-directory">Structure of the Distribution Directory</a><ul>
<li><a href="#generating-makfiles">Generating makfiles</a></li>
</ul></li>
<li><a href="#build-and-installuninstall">Build and Install/Uninstall</a></li>
<li><a href="#web-page-and-bug-reporting">Web Page and Bug Reporting</a></li>
</ul></li>
<li><a href="#ch:CA">Cellular Automata</a><ul>
<li><a href="#sec:CAInformaDef">Informal Definition of Cellular Automata</a></li>
<li><a href="#formal-definition-of-cellular-automata">Formal Definition of Cellular Automata</a></li>
<li><a href="#some-applications-of-cellular-automata">Some Applications of Cellular Automata</a></li>
<li><a href="#extended-cellular-automata">eXtended Cellular Automata</a><ul>
<li><a href="#formal-definition-of-extended-cellular-automata">Formal Definition of eXtended Cellular Automata</a></li>
</ul></li>
</ul></li>
<li><a href="#ch:opencal">OpenCAL</a><ul>
<li><a href="#sec:cal_life">Conway’s Game of Life</a></li>
<li><a href="#opencal-statement-convention">OpenCAL statement convention</a></li>
<li><a href="#sec:CustomNeiughbourhoods">Custom Neighbourhoods</a></li>
<li><a href="#sec:sciddicaT">SciddicaT</a></li>
<li><a href="#sec:sciddicaT_active">SciddicaT with active cells optimization</a></li>
<li><a href="#sec:sciddicaT_extended">SciddicaT as eXtended CA</a></li>
<li><a href="#sciddicat-with-explicit-simulation-loop">SciddicaT with explicit simulation loop</a></li>
<li><a href="#sec:mod2">A three-dimensional example</a></li>
<li><a href="#combining-opencal-and-opencal-gl">Combining OpenCAL and OpenCAL-GL</a><ul>
<li><a href="#implementing-sciddicat-in-opencal-and-opencal-gl">Implementing SciddicaT in OpenCAL and OpenCAL-GL</a></li>
<li><a href="#implementing-the-mod2-ca-in-opencal-and-opencal-gl">Implementing the <em>mod2</em> CA in OpenCAL and OpenCAL-GL</a></li>
</ul></li>
<li><a href="#opencal-global-functions">OpenCAL Global Functions</a></li>
</ul></li>
<li><a href="#ch:opencal-omp">OpenCAL OpenMP version</a><ul>
<li><a href="#conways-game-of-life-in-opencal-omp">Conway’s Game of Life in OpenCAL-OMP</a></li>
<li><a href="#sciddicat">SciddicaT</a></li>
<li><a href="#sciddicat-with-active-cells-optimization">SciddicaT with active cells optimization</a></li>
<li><a href="#sciddicat-as-extended-ca">SciddicaT as eXtended CA</a></li>
<li><a href="#sciddicat-with-explicit-simulation-loop-1">SciddicaT with explicit simulation loop</a></li>
<li><a href="#sciddicat-computational-performance">SciddicaT computational performance</a></li>
<li><a href="#a-three-dimensional-example">A three-dimensional example</a></li>
</ul></li>
<li><a href="#ch:opencal-cl">OpenCAL OpenCL version</a><ul>
<li><a href="#sec:openclstructure">OpenCL framework</a></li>
<li><a href="#the-structure-of-opencal-cl">The structure of OpenCAL-CL</a></li>
<li><a href="#host-programming">Host programming</a><ul>
<li><a href="#definition-of-the-model">Definition of the model</a></li>
<li><a href="#manage-of-the-devices">Manage of the devices</a></li>
<li><a href="#allocation-of-kernels">Allocation of kernels</a></li>
<li><a href="#send-data-from-the-model-to-devices">Send data from the model to devices</a></li>
<li><a href="#start-execution-loop">Start execution loop</a></li>
</ul></li>
<li><a href="#kernel-programming">Kernel programming</a></li>
<li><a href="#conways-game-of-life-with-opencal-cl">Conway’s Game of Life with OpenCAL-CL</a></li>
</ul></li>
</ul>
</div>
<p><br />
</p>
<p><br />
</p>
<p><br />
</p>
<p><br />
</p>
<h1 id="acknowledgements" class="unnumbered">Acknowledgements</h1>
<p><em>……</em></p>
<h1 id="introduction">Introduction</h1>
<p>Cellular Automata (CA) represent a parallel computing methodology for modelling complex systems. Well known examples of applications include the simulation of natural phenomena such as lava and debris flows, forest fires, agent based social processes such as pedestrian evacuation and highway traffic problems, besides many others (e.g., theoretical studies).</p>
<p>Many Cellular Automata software environments and libraries exist. However, when non-trivial modelling is needed, only non-open source software are generally available. This is particularly true for eXtended Cellular Automata (XCA), adopted for simulating phenomena at a macroscopic point of view, for which only a significant example of non free software exists, namely the CAMELot Cellular Automata Simulation Environment.</p>
<p>In order to fill this deficiency in the world of free software, the <code>OpenCAL</code> C Library has been developed. Similarly to CAMELot, it allows for a simple and concise definition of both the transition function and the other characteristics of the cellular automaton definition. Moreover, it allows for both sequential and parallel execution, both on CPUs and GPUs (thanks to the adoption of the OpenMP and OpenCL APIs, respectively), hiding most parallel implementation issues to the user.</p>
<p>The library has been tested on both CPUs anf GPUs by considering different Cellular Automata, including the well known Conway’s Game of Life and the SciddicaT XCA debris flows simulation model. Results have demonstrated the goodness the library both in terms of usability and performance.</p>
<p>In the present release 1.0 of the library, 2D and 3D cellular automata can be defined<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a>. The library also offers diverse facilities (e.g. it provides many pre-defined cell’s neighborhoods), allows to explicate the simulation main cycle and provides a built in optimization algorithm to speed up your simulation. Moreover, OpenCAL offers you a built in interactive 2D/3D visualization system you can use to graphically represent step by step your simulation based. The visualization system was developed in OpenGL Compatibility Profile, thus is able to run everywhere, also on old workstations.</p>
<p>The present manual reports the main usage of the <code>OpenCAL</code> library related to the sequential, OpenMP and OpenCL versions, the library installation procedure, besides diverse examples of application. In particular, Chapter [ch:installation] deals with OpenCAL download and installation, while Chapter [ch:CA] introduce you to the CA and XCA computational paradigms. If you already know theese topics, you can skip this Chapter. Chapter [ch:opencal] is about serial CA and XCA development with OpenCAL, and introduce the different library features by examples. Chapter [ch:opencal-omp] is about the OpenMP-based parallel version of OpenCAL and also introduces the library by examples. Chapter [ch:opencal-cl] briefly introduces you to General Purpose GPU programmin with OpenCL and then presents you the OpenCL-based version of OpenCAL, still by examples. OpenCAL-GL, whihch is the name that identifies the OpenGL visualization system built into OpenCAL, is introduced at the end of each of the above Chapters, together with computational performaces of some of the implemented CA. Eventually, Chapter [ch:utility] ends this user guide by presenting you some usefull library features that weren’t presented previously, like reduction functions.</p>
<h1 id="quick-start">Quick Start</h1>
<h1 id="ch:installation">Installation</h1>
<p>The 1.0 release of <code>OpenCAL</code>, here presented, is a collection of four different software libraries. Under the name <code>OpenCAL</code> we identify the serial version of the library. It comes together with two different parallel implementations based on OpenMP and OpenCL, namely OpenCAL-OMP and OpenCAL-CL, respectively. Moreover, OpenCAL-GL identifies a general purpose OpenGL/GLUT visualization library. Many examples are also included into the distribution.</p>
<p>The library can be obtained as source code from GitHub to be compiled in your system. You can chose if compiling all the four different implementations or only some of them, and also if compling the examples. Some dependencies must be satisfied, depending on what you want to compile. Libraries can be compiled both statically (as .lib files) and as shared objects (as .so files). Eventually, you can install headers and libraries in your system and also uninstall them if no longer needed.</p>
<p>In the following Sections we will see all the steps needed to obtain the software and make it working on your computer.</p>
<h2 id="requirements-and-dependencies">Requirements and dependencies</h2>
<p>In order to compile OpenCAL, you essentially need CMake (version 2.8 or greater) and an ANSI C compiler (e.g. gcc 4.9 or greater<a href="#fn2" class="footnoteRef" id="fnref2"><sup>2</sup></a>) installed in your system. CMake is used to generate the makefiles (or even project files for several IDEs) to be used to build OpenCAL.</p>
<p>An OpenCL implementation (e.g. by NVIDIA, AMD, Intel) is also needed to compile OpenCAL-CL. Moreover, still to compile OpenCAL-CL, at least the 3.1 version of CMake is required. Eventually, a GLUT development library is needed to compile OpenCAL-GL. The following is a dependencies list for each of the above software libraries:</p>
<dl>
<dt>OpenCAL: </dt>
<dd><p>CMake version 2.8 or grater, and a quite new C compiler.</p>
</dd>
<dt>OpenCAL-OMP: </dt>
<dd><p>A C compiler supporting at least Open-MP version 2.0<a href="#fn3" class="footnoteRef" id="fnref3"><sup>3</sup></a>.</p>
</dd>
<dt>Open-CL: </dt>
<dd><p>CMake version 3.1 or grater and at least one OpenCL implementation installed in your system<a href="#fn4" class="footnoteRef" id="fnref4"><sup>4</sup></a>.</p>
</dd>
<dt>OpenCAL-GL: </dt>
<dd><p>OpenGL/GLUT headers and libraries <a href="#fn5" class="footnoteRef" id="fnref5"><sup>5</sup></a>;</p>
</dd>
</dl>
<p>Eventually, Doxygen and Graphviz are required to build the software documentation, which provides specific information about library’s data structures and functions.</p>
<h2 id="obtaining-opencal">Obtaining OpenCAL</h2>
<p>OpenCAL is available as source code you can downoad from GitHub and build into your own system. We suggest you to download the stable software release at the following GitHub url:</p>
<p><a href="https://github.com/OpenCALTeam/opencal/archive/OpenCAL-1.0.zip" class="uri">https://github.com/OpenCALTeam/opencal/archive/OpenCAL-1.0.zip</a></p>
<p>Another (not recommended) option is to dowload the development release. It can be obtained as compressed archive at <a href="https://github.com/OpenCALTeam/opencal/archive/master.zip" class="uri">https://github.com/OpenCALTeam/opencal/archive/master.zip</a>, or by <em>cloning</em> the Git repository through the following commands:</p>
<div class="sourceCode" numbers="none" language="bash"><pre class="sourceCode bash"><code class="sourceCode bash"><span class="kw">user@machine</span>:-$ cd <span class="kw"><</span>gitwd<span class="kw">></span>
<span class="kw">user@machine</span>:-$ git clone https://github.com/OpenCALTeam/OpenCAL
<span class="kw">user@machine</span>:-$ cd opencal</code></pre></div>
<p>Here <code><gitwd></code> represents the directory in your file system containing your git respositories. If you don’t have a git working directory, we suggest you to create one.</p>
<h2 id="structure-of-the-distribution-directory">Structure of the Distribution Directory</h2>
<p>The software distribution contains the following files and directories:</p>
<dl>
<dt>AUTHORS - </dt>
<dd><p>OpenCAL Authors list;</p>
</dd>
<dt>OpenCAL - </dt>
<dd><p>Core and examples code of the <em>serial</em> implementation;</p>
</dd>
<dt>OpenCAL-CL - </dt>
<dd><p>Core and examples code of the Open-CL-based parallel implementation;</p>
</dd>
<dt>OpenCAL-GL - </dt>
<dd><p>Graphic OpenGL/GLUT-based visualization system library and examples;</p>
</dd>
<dt>OpenCAL-OMP - </dt>
<dd><p>Core and examples code of the Open-MP-based parallel implementation;</p>
</dd>
<dt>CMakeLists.txt - </dt>
<dd><p>CMake configuration file;</p>
</dd>
<dt>cmake_uninstall.cmake.in - </dt>
<dd><p>Uninstall script;</p>
</dd>
<dt>configure.cmake - </dt>
<dd><p>File needed by (i.e. included in) CMakeLists.txt;</p>
</dd>
<dt>LICENSE</dt>
<dd><p>The GNU LGPL licence;</p>
</dd>
<dt>Other minor files - </dt>
<dd><p>Other minor files can also be included into the distribution.</p>
</dd>
</dl>
<h3 id="generating-makfiles">Generating makfiles</h3>
<p>Once you have satisfied all necessary dependencies it is time to generate the makefiles (or the project files or for your development environment) needed to build the software distribution. CMake needs to know two paths for this:</p>
<ol>
<li><p>The path to the OpenCAL source tree root directory;</p></li>
<li><p>The target path for the generated files and compiled binaries.</p></li>
</ol>
<p>If these paths are the same, we are in front of an in-tree build, otherwise of an out-of-tree build. We strongly suggest to do an out-of-tree build. One of several advantages of out-of-tree builds is that you can generate files and compile for different development environments using a single source tree.</p>
<p>To make an out-of-tree build you need to:</p>
<ol>
<li><p>Enter the source tree root directory (e.g. <code><gitwd>/opencal/</code>);</p></li>
<li><p>Create a directory for the binaries (e.g. <code><gitwd>/opencal/build/</code>) and enter into it;</p></li>
<li><p>Run CMake using zero or more of the options listed in Table [ch:installation:cmakeoptions] to control which features will be enabled in the compiled library. The current directory is used as target path, while the path provided as an argument is used to find the source tree.</p></li>
</ol>
<p>If you want to build everything (serial and parallel libraries, examples and documentation), you can use the following commands:</p>
<pre><code> user@machine:-$ cd <gitwd>/opencal/
user@machine:-$ mkdir build && cd build
user@machine:-$ cmake ../ -DBUILD_OPENCAL_SERIAL=ON \
-DBUILD_OPENCAL_OMP=ON \
-DBUILD_OPENCAL_CL=ON \
-DBUILD_OPENCAL_GL=ON \
-DBUILD_EXAMPLES=ON \
-DBUILD_DOCUMENTATION=ON \
-DENABLE_SHARED=ON</code></pre>
<p>Each CMake option corresponds to a target. If you are not interested to some of them, simply switch off the corresponding option. For instace, if you set the <code>ENABLE_SHARED</code> option to <code>OFF</code> (that, indeed, is the default value), generated makefiles will be set to build static libraries (.a files under Linux systems) instead of shared objects (.so files). If you omit a CMake option, the default value will be assumed (cf. Table [ch:installation:cmakeoptions]).</p>
<p><span>lXl</span> <strong>OPTION</strong> & <strong>EFFECT</strong> & <strong>DEFAULT</strong><br />
<code>BUILD_OPENCAL_SERIAL</code> & Build the OpenCAL serial version & ON<br />
<code>BUILD_OPENCAL_OMP</code> & Build the OpenCAL-OMP OpenMP parallel version (OpenMP required) & OFF<br />
<code>BUILD_OPENCAL_CL</code> & Build the OpenCAL-CL OpenCL parallel version (OpenCL required) &OFF<br />
<code>BUILD_OPENCAL_GL</code> & Build the OpenCAL-GL visualization library (OpenGL and GLUT required) &OFF<br />
<code>BUILD_EXAMPLES</code> & Build the examples for each OpenCAL version &<br />
<code>BUILD_DOCUMENTATION</code> & Build the HTML based API documentation (Doxygen and Graphviz required) & OFF<br />
<code>ENABLE_SHARED</code> & Controls whether the library should be compiles as shared object (.so). If OFF, static objects (.a) will be produced & OFF<br />
</p>
<h2 id="build-and-installuninstall">Build and Install/Uninstall</h2>
<p>Once makefiles have been produced by CMake, everything is set up and ready for compiling. To build use the following command:</p>
<pre><code> user@machine:-$ make -jn</code></pre>
<p>where <code>n</code> is the number of cores of your machine you want to use for speeding up the compilation process.</p>
<p>You can install the compiled objects (libraries and examples - if enabled during the CMake configuration), headers and API documentation in the appropriate folders using the following command:</p>
<pre><code> user@machine:-$ sudo -
root@machine:-$ make install</code></pre>
<p>or equivalently, if your user is in the <code>sudoers</code> list</p>
<pre><code> user@machine:-$ sudo make install</code></pre>
<p>By default, files are installed in ... For instance, under Debian, file are installed in... If you want to change the default settings...</p>
<pre><code> <QUI BISOGNA SPIEGARE COSA SI DEVE FARE PER CAMBIARE IL TARGET PER IL
MAKE INSTALL></code></pre>
<p>If you want to uninstall OpenCAL for some reason, you can simply enter the build directory and call make with the uninstall target, as in the following:</p>
<pre><code> user@machine:-$ sudo make uninstall</code></pre>
<h2 id="web-page-and-bug-reporting">Web Page and Bug Reporting</h2>
<p>The Web page for OpenCAL is at <a href="http://opencal.telesio.unical.it" class="uri">http://opencal.telesio.unical.it</a> and contains up-to-date news and a list of bug reports. <span>OpenCAL</span>’s GitHub homepage is at <a href="https://github.com/OpenCALTeam/opencal" class="uri">https://github.com/OpenCALTeam/opencal</a>. For further information or bug reports contact <script type="text/javascript">
<!--
h='telesio.unical.it';a='@';n='opencal';e=n+a+h;
document.write('<a h'+'ref'+'="ma'+'ilto'+':'+e+'" clas'+'s="em' + 'ail">'+'mailto:opencal@telesio.unical.it'+'<\/'+'a'+'>');
// -->
</script><noscript>mailto:opencal@telesio.unical.it (opencal at telesio dot unical dot it)</noscript> or use the submit an issue at the following url <a href="https://github.com/OpenCALTeam/opencal/issues" class="uri">https://github.com/OpenCALTeam/opencal/issues</a>.</p>
<p>When reporting a bug, please include as much information and documentation as possible. Helpful information would include OpenCAL version, OpenMP/OpenCL implementation and version used, configuration options, type of computer system, problem description, and error message output.</p>
<h1 id="ch:CA">Cellular Automata</h1>
<p>Cellular Automata (CA) are parallel computing models, whose evolution is ruled by local rules. They are largely employed in Science and Engineering for a wide range of problems, especially when classic methods (e.g., based on differential equations) can not be conveniently applied. In this Chapter, CA are briefly introduced, together with a recent extension of them, known as eXtended Cellular Automata (XCA), which are widely used for the modelling of physical extended systems.</p>
<h2 id="sec:CAInformaDef">Informal Definition of Cellular Automata</h2>
<p>A cellular automaton can be thought as a <span class="math inline">\(d\)</span>-dimensional space, called <em>cellular space</em>, subdivided in regular cells of uniform shape and size. Each cell embeds a <em>finite automaton</em>, one of the most simple and well known computational models, which can assume a finite number of states. At time <span class="math inline">\(t=0\)</span>, cells are in arbitrary states and the CA evolves step by step by changing the states of the cells at discrete time steps, by applying the same local rule of evolution, i.e. the cell’s <em>transition function</em>, simultaneously (i.e. in parallel) to each cell of the CA. Input for the cell is given by the states of a predefined (small) set of neighboring cells, which is assumed constant in space and time. It is possible to identify an informal definition of cellular automaton by simply listing its main properties:</p>
<ul>
<li><p>It is formed by a <span class="math inline">\(d\)</span>-dimensional space (i.e. the <em>cellular space</em>), partitioned into cells of uniform shape and size (e.g. triangles, squares, hexagons, cubes, etc. - see Figure [fig:cellularspaces]);</p></li>
<li><p>The number of cell’s states is finite;</p></li>
<li><p>The evolution occurs through discrete steps;</p></li>
<li><p>Each cell changes state simultaneously to each other (i.e. they change state concurrently, in parallel) thanks to the application of the cell’s <em>transition function</em>;</p></li>
<li><p>The cell’s state transition depends on the states of a set of neighboring cells;</p></li>
<li><p>The evolving cell is called <em>central cell</em> and can belong to its neighbourhood;</p></li>
<li><p>The neighboring relationship is local, uniform and invariant over time (see Figures [fig:1Dneighborhood] and [fig:2Dneighborhood] for examples of 1D and 2D neighborhoods, respectively).</p></li>
</ul>
<div class="figure">
<img src="./images/CellularAutomata/cellularspaces.png" alt="Example of cellular spaces: (a) one-dimensional, (b) two-dimensional with square cells, (c) two-dimensional with hexagonal cells, (d) three-dimensional with cubic cells." width="453" />
<p class="caption">Example of cellular spaces: (a) one-dimensional, (b) two-dimensional with square cells, (c) two-dimensional with hexagonal cells, (d) three-dimensional with cubic cells.<span data-label="fig:cellularspaces"></span></p>
</div>
<div class="figure">
<img src="./images/CellularAutomata/onedimensional.png" alt="Example of neighborood with radius (a) r = 1 and (b) r
= 2 for one-dimensional Cellular Automata. Central cell is represented in dark gray, while adjacent cells are in light gray. Note that the central cell can optionally belong to its own neighborhood." width="453" />
<p class="caption">Example of neighborood with radius (a) <span class="math inline">\(r = 1\)</span> and (b) <span class="math inline">\(r
= 2\)</span> for one-dimensional Cellular Automata. Central cell is represented in dark gray, while adjacent cells are in light gray. Note that the central cell can optionally belong to its own neighborhood.<span data-label="fig:1Dneighborhood"></span></p>
</div>
<div class="figure">
<img src="./images/CellularAutomata/2Dneighborhoods.png" alt="Examples of von Neumann (a) and Moore (b) neighboroods for two-dimensional CA with square cells. Examples of (Moore) neighborhoods are also shows for hexagonal CA, both for the cases of horizontal (c) and vertical (d) cells orientation. Central cell is represented in dark gray, while adjacent cells are in light gray. Note that the central cell can optionally belong to its own neighborhood." width="453" />
<p class="caption">Examples of von Neumann (a) and Moore (b) neighboroods for two-dimensional CA with square cells. Examples of (Moore) neighborhoods are also shows for hexagonal CA, both for the cases of horizontal (c) and vertical (d) cells orientation. Central cell is represented in dark gray, while adjacent cells are in light gray. Note that the central cell can optionally belong to its own neighborhood.<span data-label="fig:2Dneighborhood"></span></p>
</div>
<p>Despite their simple definition, CA can exhibit very interesting complex global behaviors. Moreover, from a computational point of view, they are equivalent to Turing Machines. This means, in principle, that everything that can be computed can also be by means of a cellular automaton (Church–Turing thesis). Thanks to their <em>computational universality</em>, CA gained a great consideration inside the Scientific Community and were employed for solving a great variety of different complex problems.</p>
<h2 id="formal-definition-of-cellular-automata">Formal Definition of Cellular Automata</h2>
<p>Cellular Automata are formally defined as a quadruple:</p>
<p><span class="math display">\[A = <R,X,Q,\sigma>\]</span></p>
<p>where:</p>
<ul>
<li><p><span class="math inline">\(R = \{i = (i_0,i_1,....,i_{d-1}) \; | \; i_k \in \mathbb{Z} \;\; \forall k =
0,1,...,d-1\}\)</span> is the set of points, with integer coordinates, which defines the <span class="math inline">\(d\)</span>-dimensional cellular space;</p></li>
<li><p><span class="math inline">\(X = \{\xi_0,\xi_1\,...\xi_{m-1}\}\)</span> is the finite set of <span class="math inline">\(m\)</span> <span class="math inline">\(d\)</span>-dimensional vectors <span class="math display">\[\xi_j = \{\xi_{j_0},\xi_{j_1},...\xi_{j_{d-1}}\}\]</span> that define the set <span class="math display">\[N(X,i) = \{i + \xi_0,i + \xi_1,...,i + \xi_{m-1}\}\]</span> of coordinates of cells belonging the the central cell’s neighborhood (i.e. the cell with coordinates <span class="math inline">\(i =
(i_1,i_2,...i_d)\)</span>). In other words, <span class="math inline">\(X\)</span> represents the geometrical pattern that specifies the neighbourhood relationship;</p></li>
<li><p><span class="math inline">\(Q\)</span> is the finite set of states of the cell;</p></li>
<li><p><span class="math inline">\(\sigma : Q^m \rightarrow Q\)</span> is the cell’s transition function.</p></li>
</ul>
<h2 id="some-applications-of-cellular-automata">Some Applications of Cellular Automata</h2>
<p>CA are particularly suited to model and simulate classes of complex systems characterized by a large number of interacting elementary components. The assumption that if a system behaviour is complex, the model that describes it must necessarily be of the same complexity is replaced by the idea that its behavior can be described, at least in some cases, in very simple terms.</p>
<p>Among different fields, fluid-dynamics is one of most important applications for CA and, in this research branch, many different CA-based methods can be found in the literature to simulate fluid flows. For instance, Lattice Gas Automata (LGA) were introduced for describing the motion and collision of <em>particles</em> on a grid and it was shown that such models can reproduce the main fluid dynamical properties. The continuum limit of these models leads to the Navier-Stokes equations. Lattice Gas models can be regarded as microscopic models, as they describe the motion of fluid particles which interact by scattering. An advantage of Lattice Gas models is that the simplicity of particles, and of their interactions, allow for the simulation of a large number of them, making it therefore possible to observe the emergence of flow patterns. Furthermore, since they are CA systems, it is possible to easily run simulations in parallel. A different approach to LGA is represented by Lattice Boltzmann models in which the state variables can take continuous values, as they are supposed to represent the density of fluid particles, endowed with certain properties, located in each cell (space and time are also discrete, as in lattice gas models). Both Lattice Gas and Lattice Boltzmann Models have been applied for the description of fluid turbulence.</p>
<p>Since many complex natural phenomena evolve on very large areas, they are therefore difficult to be modelled at a microscopic level of description. Among these, we can find some real flow-type phenomena like debris and lava flows, as well as floods and pyroclastic flows. Besides rheological complex behaviour, they generally evolve on complex topographies, that can even change during the phenomenon evolution, and are often characterised by branching and rejoining of the flow. In order to better model such kind of phenomena, an extended notion of Cellular Automata (eXtended Cellular Automata, described in the next Section), can represent a valid alternative to classical CA.</p>
<h2 id="extended-cellular-automata">eXtended Cellular Automata</h2>
<p>As regards the modeling of natural complex phenomena, Prof. Gino ‘Miracle Crisci’ and co-workers from University of Calabria (Italy) proposed a method based on an eXtended notion of CA (XCA), firstly applied to the simulation of basaltic lava flows in the 80’s<a href="#fn6" class="footnoteRef" id="fnref6"><sup>6</sup></a>. It was shown that the approach behind XCA can greatly make more straightforward the modeling of some complex systems.</p>
<p>Informally, XCA, compared to classical CA, are different because of the following reasons:</p>
<ul>
<li><p>The cell’s state is decomposed in <em>substates</em>, each of them representing the set of admissible values of a given characteristic assumed to be relevant for the modeled system and its evolution (e.g., lava temperature, lava thickness, etc, in the case of ca lava flow model). The set of states for the cell is simply obtained as the Cartesian product of the considered substates.</p></li>
<li><p>As the cell’s state can be decomposed in substates, also the transition function can be split into <em>elementary processes</em>, each of them representing a particular aspect that rules the dynamic of the considered phenomenon. In turn, <em>elementary processes</em> can be split into <em>local interactions</em>, which refer to rules that deal with interactions among substates of the cell with neighbor ones (e.g., mass exchange with neighbours) and <em>internal transformations</em>, defined as the changes in the values of the substates due only to interactions among substates inside the cell (e.g. the solidi¢cation of the lava inside the cell due to the substate ‘lava temperature’).</p></li>
<li><p>A set of <em>parameters</em>, commonly used to characterize the dynamic behaviors of the considered phenomenon, can be defined.</p></li>
<li><p>Global operations can also be allowed (e.g. to model external influences that can not easily be described in terms of local interactions, or to perform reductions over the whole, or a subset of, the cellular space). Such operations go under the name of <em>steering</em>.</p></li>
</ul>
<h3 id="formal-definition-of-extended-cellular-automata">Formal Definition of eXtended Cellular Automata</h3>
<p>Formally, a XCA is defined as a 7-tuple:</p>
<p><span class="math display">\[A = <R,X,Q,P,\sigma,\Gamma,\gamma>\]</span></p>
<p>where:</p>
<ul>
<li><p><span class="math inline">\(R\)</span> is the <span class="math inline">\(d\)</span>-dimensional cellular space.</p></li>
<li><p><span class="math inline">\(X\)</span> is the geometrical pattern that specifies the neighbourhood relationship; <span class="math inline">\(m = |X|\)</span> represent the number of elements in the set <span class="math inline">\(X\)</span>, i.e. the number of neighbors for the central cell.</p></li>
<li><p><span class="math inline">\(Q = Q_0 \times Q_1 \times....\times Q_{n-1}\)</span> is the set of cell’s states, expressed as Cartesian product of the <span class="math inline">\(n\)</span> considered <em>substates</em> <span class="math inline">\(Q_0 \times Q_1 \times....\times Q_{n-1}\)</span>.</p></li>
<li><p><span class="math inline">\(P = {p_0,p_1,....,p_{p-1}}\)</span> is the set of CA <em>parameters</em>.They can allow a fine tuning of the XCA model, with the purpose of reproducing different dynamical behaviours of the phenomenon of interest.</p></li>
<li><p><span class="math inline">\(\sigma : Q^m \rightarrow Q\)</span> is the cell’s transition function. It is splitted in <span class="math inline">\(s\)</span> <em>elementary processes</em>, <span class="math inline">\(\sigma_0,\sigma_1, ...,
\sigma_{s-1}\)</span>, each one describing a particular aspect ruling the dynamic of the considered system.</p></li>
<li><p><span class="math inline">\(\Gamma \subseteq R\)</span> is the region over which the steering function is applied.</p></li>
<li><p><span class="math inline">\(\gamma: Q^{|\Gamma|} \rightarrow Q^{|\Gamma|} \times
\mathbb{R}\)</span> is the (global) steering functions.</p></li>
</ul>
<p>In the next Chapters, some examples of XCA will be presented. Their implementations in OpenCAL will also be described, both in serial (Chapter [ch:opencal]) and in parallel (Chapters [ch:opencal-omp] and [ch:opencal-cl]).</p>
<h1 id="ch:opencal">OpenCAL</h1>
<p>With OpenCAL (without any suffix) we identify the sequential version of the OpenCAL software library, which runs on just a single core of your CPU, and represents the basis for the other parallel versions, namely OpenCAL-OMP, based on OpenMP, and OpenCAL-CL, based on OpenCL (and thus able to run on GPUs). Moreover, OpenCAL allows for some <em>unsafe operations</em>, which can significantly speed up your application. Such unsafe operations can also be found in the OpenMP version, while they are not currently available in OpenCAL-CL.</p>
<p>In the following sections, we will introduce OpenCAL by examples. In the first part of the Chapter, we will deal with the OpenCAL’s <em>safe mode</em>, while in the last one, we will go deep inside OpenCAL, discussing <em>unsafe mode</em> operations.</p>
<h2 id="sec:cal_life">Conway’s Game of Life</h2>
<p>In order to introduce you to Cellular Automata development with OpenCAL, we start this section by implementing the Conway’s Game of Life. It represents one of the most simple, yet powerful examples of Cellular Automata, devised by mathematician John Horton Conway in 1970.</p>
<p>The Game of Life can be thought as an infinite two-dimensional orthogonal grid of square cells, each of which is in one of two possible states, <em>dead</em> or <em>alive</em>. Every cell interacts with the eight adjacent neighbors <a href="#fn7" class="footnoteRef" id="fnref7"><sup>7</sup></a> belonging to the Moore neighborhood. At each time step, one of the following transitions occur:</p>
<ol>
<li><p>Any live cell with fewer than two alive neighbors dies, as if by loneliness.</p></li>
<li><p>Any live cell with more than three alive neighbors dies, as if by overcrowding.</p></li>
<li><p>Any live cell with two or three alive neighbors lives, unchanged, to the next generation.</p></li>
<li><p>Any dead cell with exactly three live neighbors comes to life.</p></li>
</ol>
<p>The initial configuration of the system specifies the state (dead or alive) of each cell into the cellular space. The evolution of the system is thus obtained by applying the above rules (which define the cell’s transition function) simultaneously to every cell in the cellular space, so that each new configuration is function of the one at the previous step. The rules continue to be applied repeatedly to create further generations. For more details on the Game of Life, please check Wikipedia at the URL <a href="http://en.wikipedia.org/wiki/Conway's_Game_of_Life" class="uri">http://en.wikipedia.org/wiki/Conway's_Game_of_Life</a>.</p>
<div class="figure">
<img src="./images/OpenCAL/LifeNeighborhood.png" alt="OpenCAL’s 2D cellular space and Moore neighborhood. Cells are individuated by a couple of matrix-style integer coordinates (i, j), where i represents the row and j the column. Cell (0,0) is the one located at the upper-left corner. Moore neighborhood is represented in gray scale, with the central cell highlighted in dark gray. Neighboring cells can also be indexed by the integer subscript shown within the cell. Cells indices are implicitly assigned by OpenCAL, both in the case of predefined and custom neighborhoods. In the first case, indices can not be modified, while in the second case, indices are assigned progressively in an automatic way each time a new neighbor is added to the CA by means of the calAddNeighbor2D() function." width="188" />
<p class="caption">OpenCAL’s 2D cellular space and Moore neighborhood. Cells are individuated by a couple of matrix-style integer coordinates <span class="math inline">\((i, j)\)</span>, where <span class="math inline">\(i\)</span> represents the row and <span class="math inline">\(j\)</span> the column. Cell (0,0) is the one located at the upper-left corner. Moore neighborhood is represented in gray scale, with the central cell highlighted in dark gray. Neighboring cells can also be indexed by the integer subscript shown within the cell. Cells indices are implicitly assigned by OpenCAL, both in the case of predefined and custom neighborhoods. In the first case, indices can not be modified, while in the second case, indices are assigned progressively in an automatic way each time a new neighbor is added to the CA by means of the <code>calAddNeighbor2D()</code> function.<span data-label="fig:LifeNeighborhood"></span></p>
</div>
<p>The formal definition of the Life CA is reported below.</p>
<p><span class="math display">\[Life = < R, X, Q, \sigma >\]</span></p>
<p>where:</p>
<ul>
<li><p><span class="math inline">\(R\)</span> is the set of points, with integer coordinates, which defines the 2-dimensional cellular space. The generic cell in <span class="math inline">\(R\)</span> is individuated by means of a couple of integer coordinates <span class="math inline">\((i, j)\)</span>, where <span class="math inline">\(0 \leq i < i_{max}\)</span> and <span class="math inline">\(0 \leq j < j_{max}\)</span>. The first coordinate, <span class="math inline">\(i\)</span>, represents the row, while the second, <span class="math inline">\(j\)</span>, the column. The cell at coordinates <span class="math inline">\((0,0)\)</span> is located at the top-left corner of the computational grid (cf. Figure [fig:LifeNeighborhood]).</p></li>
<li><p><span class="math inline">\(X = \{(0,0), (-1, 0), (0, -1), (0, 1), (1, 0), (-1,-1), (1,-1),
(1,1), (-1,1)\}\)</span> is the Moore neighborhood relation, a geometrical pattern which identifies the cells influencing the state transition of the central cell. The neighborhood of the generic cell of coordinate <span class="math inline">\((i, j)\)</span> is given by <span class="math display">\[N(X, (i, j)) =\]</span> <span class="math display">\[= \{(i, j)+(0,0), (i, j)+(-1, 0), \dots, (i, j)+(-1,1)\} =\]</span> <span class="math display">\[= \{(i, j), (i-1, j), \dots, (i-1,j+1)\}\]</span> Here, a subscript operator can be used to index cells belonging to the neighbourhood. Let <span class="math inline">\(|X|\)</span> be the number of elements in X, and <span class="math inline">\(n \in
\mathbb{N}\)</span>, <span class="math inline">\(0 \leq n < |X|\)</span>; the notation</p>
<p><span class="math display">\[N(X, (i, j), n)\]</span></p>
<p>represents the <span class="math inline">\(n^{th}\)</span> neighbourhood of the cell <span class="math inline">\((i,j)\)</span>. Thereby, <span class="math inline">\(N(X, (i, j), 0) = (i, j)\)</span>, i.e. the central cell, <span class="math inline">\(N(X, (i, j), 1)
= (i-1, j)\)</span>, i.e. the first neighbour, and so on (cf. Figure [fig:LifeNeighborhood]).</p></li>
<li><p><span class="math inline">\(Q = \{0, 1\}\)</span> is the set of cell’s states.</p></li>
<li><p><span class="math inline">\(\sigma : Q^9 \rightarrow Q\)</span> is the deterministic cell transition function. It is composed by one elementary process, which implements the previously above mentioned transition rules.</p></li>
</ul>
<p>The program below shows a simple Game of Life sequential implementation in C with OpenCAL. As you can see, even if Listing [lst:cal_life] is very short, it completely defines the Conway’s Game of Life CA and performs a simulation (actually, only one computational step in this example).</p>
<p>In order to use OpenCAL, you need to include some header files (lines 3-5). Specifically, <code>cal2D.h</code> (line 3) allows you to define the CA object (line 9) and the related substate (line 10), while <code>cal2DRun.h</code> (line 5) allows you to define a CA simulation object (line 11), needed to run the CA model. The <code>cal2DIO.h</code> header file (line 4) provides you some basic input/output functions for reading/writing substates from/to file.</p>
<p>While statements at lines 9-11 just declare the required objects, they are defined later in the <code>main</code> function. In particular, the life CA object is defined at line 29 by the <code>calCADef2D()</code> function. The first 2 parameters define the CA dimensions (the number of rows and columns, respectively), while the third parameter defines the neighbourhood pattern. Here, the predefined Moore neighborhood is selected (cf. Figure [fig:LifeNeighborhood]), among those provided by OpenCAL. See Listings [lst:CALNeighborhood2D] and [lst:CALNeighborhood3D] for a list of OpenCAL’s predefined 2D and 3D neighborhoods, respectively. Custom neighborhoods can also be defined by means of the <code>calAddNeighbor2D()</code> function. In both cases, indices are assigned progressively, starting from 0, each time a new cell is added to the neighborhood.</p>
<p>The fourth parameter specifies the boundary conditions. In this case, the CA cellular space is considered as a torus, with cyclic conditions at boundaries. The last parameter allows you to specify if your model has to use the so called <em>active cells optimization</em>, that is able to restrict the computation to only <em>non-stationary cells</em>. In this case, no optimization is considered.</p>
<pre><code> struct CALModel2D* calCADef2D (
int rows,
int columns,
enum CALNeighborhood2D CAL_NEIGHBORHOOD_2D,
enum CALSpaceBoundaryCondition CAL_TOROIDALITY,
enum CALOptimization CAL_OPTIMIZATION
)</code></pre>
<p>The complete definition of <code>calCADef2D()</code> is provided in Listing [lst:calCADef2D], here reported together with few other OpenCAL key functions and data types. In particular, the <code>CALNeighborhood2D</code> enum type (Listing [lst:CALNeighborhood2D]) allows you to select one of the square or hexagonal predefined neighbourhoods, or a custom neighbourhood, whose pattern must be defined in the application. In particular, <code>CAL_VON_NEUMANN_NEIGHBORHOOD_2D</code> corresponds to the von Neumann neighbouring pattern, <code>CAL_MOORE_NEIGHBORHOOD_2D</code> to the Moore neighbouring pattern, <code>CAL_HEXAGONAL_NEIGHBORHOOD_2D</code> to the hexagonal neighbouring pattern, and <code>CAL_HEXAGONAL_NEIGHBORHOOD_ALT_2D</code> to the alternative hexagonal neighbouring pattern (cf. Figure [fig:2Dneighborhood], Section [sec:CAInformaDef]). As regards 3D neighbourhoods patterns, they are defined by means of the <code>CALNeighborhood3D</code> enum type (Listing [lst:CALNeighborhood3D]). Here, we can find the 3D equivalent versions of the von Neumann and Moore neighbourhoods, while hexagonal neighbourhoods are (obviously) not defined. Custom neighbourhoods will be discussed later in Section [sec:CustomNeiughbourhoods]. Similarly, the <code>CALSpaceBoundaryCondition</code> enum type (Listing [lst:CALSpaceBoundaryCondition]) allows you to set cyclicality condition at boundaries. Eventually, the <code>CALOptimization</code> enum type (Listing [lst:CALOptimization]) allows you to consider the <em>active cells optimization</em>, discussed later in this Chapter.</p>
<pre><code> enum CALNeighborhood2D {
CAL_CUSTOM_NEIGHBORHOOD_2D,
CAL_VON_NEUMANN_NEIGHBORHOOD_2D,
CAL_MOORE_NEIGHBORHOOD_2D,
CAL_HEXAGONAL_NEIGHBORHOOD_2D,
CAL_HEXAGONAL_NEIGHBORHOOD_ALT_2D
};</code></pre>
<pre><code> enum CALNeighborhood3D {
CAL_CUSTOM_NEIGHBORHOOD_3D,
CAL_VON_NEUMANN_NEIGHBORHOOD_3D,
CAL_MOORE_NEIGHBORHOOD_3D
};</code></pre>
<pre><code> enum CALSpaceBoundaryCondition{
CAL_SPACE_FLAT = 0,
CAL_SPACE_TOROIDAL
};</code></pre>
<pre><code> enum CALOptimization{
CAL_NO_OPT = 0,
CAL_OPT_ACTIVE_CELLS
};</code></pre>
<p>The CA simulation object is defined at line 30 by the <code>calRunDef2D()</code> function. The first parameter is a pointer to a CA object (<code>life</code> in our case), while the second and third parameters specify the initial and last simulation step, respectively. In this case, we just perform one step of computation, being both the first and last step set to 1. The last parameter allows you to specify the substate update policy. It can be implicit or explicit (Listing [lst:CALUpdateMode]). In the first case, OpenCAL does substates’ updates for you, while in the second case the substates’ updates is your responsibility. Note that, in case implicit update policy is applyied, all the CA substates are updated after the execution of each elementary process composing the CA transition function. We will discuss update policies later in this Chapter. The complete definition of <code>calRunDef2D()</code> is provided in Listing [lst:calRunDef2D()]. The <code>CALUpdateMode</code> type (Listing [lst:CALUpdateMode]) enumerates possible update policies.</p>
<pre><code> struct CALRun2D* calRunDef2D (
struct CALModel2D* ca2D,
int initial_step,
int final_step,
enum CALUpdateMode UPDATE_MODE
) </code></pre>
<pre><code> enum CALUpdateMode {
CAL_UPDATE_EXPLICIT = 0,
CAL_UPDATE_IMPLICIT
};</code></pre>
<p>Line 33 allocates memory and registers the <code>Q</code> substate to the <code>life</code> CA, while line 36 adds an elementary process to the cell transition function. The <code>calAddSubstate2Di()</code> function is very simple and self-explanatory. At the contrary, <code>calAddElementaryProcess2D()</code> must be discussed more in detail. It takes the handle to the CA model to which the elementary process must be attached and a pointer to a callback function, that defines the elementary process itself. In our example, we specified <code>life_transition_function</code> as second parameter, being it the name of a developer-defined function that you can find at lines 14-24. As you can see, the elementary process callback returns <code>void</code>. Moreover, it takes a pointer to a CA object as first parameter, followeb by a couple of integers, representing the coordinates of the generic cell in the CA space. This is the function prototype which is common to each elementary process. Note that, each elementary process is applyed by OpenCAL simultaneously to each cell of the cellular space in a computational step <a href="#fn8" class="footnoteRef" id="fnref8"><sup>8</sup></a>. However, this is completely transparent to the user, so that he/she can concentrate his/her effort on the definition of single cell behaviour.</p>
<p>When the user is going to implement an elementary process, by defining its callback function, he/she can rely on a set of OpenCAL functions that allow to get the substates values of both the central and the neighbouring cells, and to update the substates values of the central cell. In the specific case of the Game of Life, we used the <code>calGet2Di()</code> function to get the central cell’s value of the substate <code>Q</code> (remember that the central cell is identified by the coordinates (i, j), coming from the callback parameters), the <code>calGetX2Di()</code> function to get the value of the n-th neighbour’s substate <code>Q</code>, and the <code>calSet2Di()</code> function to update the value of the substate <code>Q</code> for the central cell. In the Game of Life example, we defined just one elementary process, that therefore represents the whole cell transition function. However, as we will see later, many elementary processes can be defined in OpenCAL by simply calling the <code>calAddElementaryProcess2D()</code> function many times. If you define more than one elementary process, they will be executed in the order they are added to the CA.</p>
<p>The <code>calInitSubstate2Di()</code> function at line 39 sets the whole substate <code>Q</code> to the value 0, i.e. the value of the substate <code>Q</code> is set to 0 in each cell of the cellular space. The following lines, from 42 to 46, set the value of the substate <code>Q</code> for some cells to 1, in order to define a well known <em>glider</em> pattern. In this case, we provided the cells coordinates as the third and fourth parameters. In this way, we define the initial condition of the system direcly inside the <code>main</code> function. However, as we will see later in this Chapter, such kind of initialization can be performed by means of a specific function.</p>
<p>The <code>calSaveSubstate2Di()</code> function (line 49) saves the substate <code>Q</code> to file, while the <code>calRun2D()</code> function (line 52) enters the simulation loop (actually, only one computational step in this example), and returns to the <code>main</code> function when the simulation is complete. The <code>calSaveSubstate2Di()</code> is thus called again at line 55 to save the new (last) configuration of the CA (represented by the only defined substate <code>Q</code>) to file, while the last two functions at lines 58 and 59 release previously allocated memory. The <code>return</code> statement at line 61 ends our first example.</p>
<p>Figures [fig:life_0000] and [fig:life_LAST] show the initial and final configuration of Game of Life as implemented in Listing [lst:cal_life], respectively.</p>
<div class="figure">
<img src="./images/OpenCAL/life_0000.png" alt="Initial configuration of Game of Life, as implemented in Listing [lst:cal_life]." width="264" />
<p class="caption">Initial configuration of Game of Life, as implemented in Listing [lst:cal_life].<span data-label="fig:life_0000"></span></p>
</div>
<div class="figure">
<img src="./images/OpenCAL/life_LAST.png" alt="Final configuration of Game of Life (actually, just one step of computation), as implemented in Listing [lst:cal_life]." width="264" />
<p class="caption">Final configuration of Game of Life (actually, just one step of computation), as implemented in Listing [lst:cal_life].<span data-label="fig:life_LAST"></span></p>
</div>
<h2 id="opencal-statement-convention">OpenCAL statement convention</h2>
<p>As you can easily see from a rapid sight to the source code, all the OpenCAL statements are characterized by a prefix and a suffix. All the data types have the <code>CAL</code> prefix, and an optional suffix that identifies the CA dimension (e.g. 2D for a two-dimensional model) and the basic type. For instance, in the case of the Life’s <code>Q</code> substate, the 2Di suffix of the <code>CALSubstate2Di</code> type specifies that it is a two-dimensional substate in which each element is of integer type.</p>
<p>More in detail, OpenCAL comes with three different basic numeric types, which are in lowercase (besides the prefix):</p>
<ul>
<li><p><code>CALbyte</code>, corresponding to the <code>char</code> C data type;</p></li>
<li><p><code>CALint</code>, corresponding to the <code>int</code> C data type;</p></li>
<li><p><code>CALreal</code>, corresponding to the <code>long double</code> C data type;</p></li>
</ul>
<p>while the possible substates types are:</p>
<ul>
<li><p><code>CALSubstate2Db</code>, corresponding to a <code>CALbyte</code> based substate;</p></li>
<li><p><code>CALSubstate2Di</code>, corresponding to a <code>CALint</code> based substate;</p></li>
<li><p><code>CALSubstate2Dr</code>, corresponding to a <code>CALreal</code> based substate.</p></li>
</ul>
<p>Also the OpenCAL constants have a prefix, namely the <code>CAL_</code> one, followd by at least one uppercase keyword. In case of more kewords, they are separated by the <code>_</code> character. Eventually, all the OpenCAL functions start with the <code>cal</code> suffix, followed by at least one capitalized keyword, and end with a suffix specifying the CA dimension and the basic datatype (e.g. the suffix <code>2Dr</code> is for a two-dimensionale real substate).</p>
<h2 id="sec:CustomNeiughbourhoods">Custom Neighbourhoods</h2>
<p>In the Game of Life example, we used the predefined Moore neighbourhood. As you already know, OpenCAL also provides other 2D and 3D predefined neighbourhoods. Furthermore, it allows for the definition of custom neighbourhood patterns (cf. Listings [lst:CALNeighborhood2D] and [lst:CALNeighborhood3D]).</p>
<pre><code> struct CALCell2D* calAddNeighbor2D(
struct CALModel2D* ca2D, // pointer to a 2D CA object
int i, // row coordinate
int j // column coordinate
);
struct CALCell3D* calAddNeighbor3D(
struct CALModel3D* ca3D, // pointer to a 3D CA object
int i, // row coordinate
int j, // column coordinate
int k // slice coordinate
);</code></pre>
<p>In order to define a custom neighbourhood pattern, you need to specify <code>CAL_CUSTOM_NEIGHBORHOOD_2D</code> (or <code>CAL_CUSTOM_NEIGHBORHOOD_3D</code> in case of a 3D CA) as the third parameter of the <code>calCADef2D()</code> function (<code>calCADef3D()</code> for a 3D CA) (cf. Listing [lst:calAddNeighbor2D-3D]). Doing this, you will start with an empty neighbouring pattern. Then call the <code>calAddNeighbor2D()</code> (<code>calAddNeighbor3D()</code> for 3D CA) function to add a cell to the pattern.</p>
<pre><code> // ...
int main ()
{
// define of the life CA and life_simulation simulation objects
life = calCADef2D (8, 16, CAL_MOORE_NEIGHBORHOOD_2D, CAL_CUSTOM_NEIGHBORHOOD_2D , CAL_NO_OPT );
//...
//add neighbors of the Moore neighborhood
calAddNeighbor2D(life, 0, 0); // neighbor 0 (central cell)
calAddNeighbor2D(life, - 1, 0); // neighbor 1
calAddNeighbor2D(life, 0, - 1); // neighbor 2
calAddNeighbor2D(life, 0, + 1); // neighbor 3
calAddNeighbor2D(life, + 1, 0); // neighbor 4
calAddNeighbor2D(life, - 1, - 1); // neighbor 5
calAddNeighbor2D(life, + 1, - 1); // neighbor 6
calAddNeighbor2D(life, + 1, + 1); // neighbor 7
calAddNeighbor2D(life, - 1, + 1); // neighbor 8
//...
}</code></pre>
<p>Listing [lst:CustomMooreExample] shows an example of how a custom neighbourhood pattern can be built for the Game of Life CA described above. In particular, the Moore neighburhood is built by using a sequence of nine calls to the <code>calAddNeighbor2D()</code> function. The first time the function is called, it adds the relative coordinates (0,0) to the neighbouring pattern. This first set of coordinates recerives the subscript identifier 0 and therefore can be used later to refer the central cell. For instance, when you call <code>calSet2Di(life,Q,i,j,0);</code> (see e.g. line 23, Listing [lst:cal_life]), the relative coordinates of the neighbor 0 (specified as last parameters), i.e. (0,0), are added to the coordinates of the cells the elementary process is applying to, i.e. (i,j) (cf. second and third to last parameters), by obtaining the cell (i,j) itself, which correspond to the central cell by definition. The subsequent calls to <code>calAddNeighbor2D()</code> add further couples of coordinates to the neighbouring pattern, by progressively assigning an integer subscript identifier to each of them.</p>
<p>Eventually, note that you are free to start from a predefined neighbourhood. For instance, let still consider the above Game of Life example; you can enrich the Moore neighbourhood pattern by simply specifying the value <code>CAL_MOORE_NEIGHBORHOOD_2D</code> as parameter for the <code>calCADef2D</code> function) and then add further neighbours by calling <code>calAddNeighbor2D()</code> as meny times the as the number of cells you want to add is.</p>
<h2 id="sec:sciddicaT">SciddicaT</h2>
<p>In the previous section we illustrated an OpenCAL implementation of a simple cellular automaton, namely the Conway’s Game of Life. Here, we will deal with a more complex example concerning the implementations of the SciddicaT Cellular Automata model for landslide simulation. Different versions will be presented, ranging from a naive to a fully optimized implementation.</p>
<p>Sciddica is a family of two-dimensional XCA (eXtended Cellular Automata - see Chapter [ch:CA]) debris flow models, successfully applied to the simulation of many real cases, such as the 1988 Mt. Ontake (Japan) landslide and the 1998 Sarno (Italy) disaster. An oversimplified toy-version of Sciddica (SciddicaT in the following) was here considered to be implemented in <code>OpenCAL</code>, and its application to the 1992 Tessina (Italy) landslide shown.</p>
<p>SciddicaT considers the surface over which the phenomenon evolves as subdivided in square cells of uniform size. Each cell changes its state by means of the transition function, which takes as input the state of the cells belonging to the von Neumann neighborhood. It is formally defined as:</p>
<p><span class="math display">\[SciddicaT = < R, X, Q , P, \sigma >\]</span></p>
<p>where:</p>
<ul>
<li><p><span class="math inline">\(R\)</span> is the set of points, with integer coordinates, which defines the 2-dimensional cellular space over which the phenomenon evolves. The generic cell in <span class="math inline">\(R\)</span> is individuated by means of a couple of integer coordinates <span class="math inline">\((i, j)\)</span>, where <span class="math inline">\(0 \leq i < i_{max}\)</span> and <span class="math inline">\(0 \leq j < j_{max}\)</span>. The firt coordinate, <span class="math inline">\(i\)</span>, represents the row, while the second, <span class="math inline">\(j\)</span>, the column. The cell at coodinates <span class="math inline">\((0,0)\)</span> is located at the top-left corner of the computational grid.</p></li>
<li><p><span class="math inline">\(X = \{(0,0), (-1, 0), (0, -1), (0, 1), (1, 0)\}\)</span> is the von Neumann neighborhood relation (cf. Figure [fig:2Dneighborhood]), a geometrical pattern which identifies the cells influencing the state transition of the central cell. The neighborhood of the generic cell of coordinate <span class="math inline">\((i, j)\)</span> is given by <span class="math display">\[N(X, (i, j)) =\]</span> <span class="math display">\[= \{(i, j)+(0,0), (i, j)+(-1, 0), (i, j)+(0, -1),
(i, j)+(0, 1), (i, j)+(1, 0)\} =\]</span> <span class="math display">\[= \{(i, j), (i-1, j), (i, j-1), (i, j+1), (i+1, j)\}\]</span></p>
<p>Here, a subscript operator can be used to index cells belonging to the neighbourhood. Let <span class="math inline">\(|X|\)</span> be the number of elements in X, and <span class="math inline">\(n \in
\mathbb{N}\)</span>, <span class="math inline">\(0 \leq n < |X|\)</span>; the notation</p>
<p><span class="math display">\[N(X, (i, j), n)\]</span></p>
<p>represents the <span class="math inline">\(n^{th}\)</span> neighbourhood of the cell <span class="math inline">\((i,j)\)</span>. Thereby, <span class="math inline">\(N(X, (i, j), 0) = (i, j)\)</span>, i.e. the central cell, <span class="math inline">\(N(X, (i, j), 1) = (i-1, j)\)</span>, i.e. the first neighbour, and so on.</p></li>
<li><p><span class="math inline">\(Q\)</span> is the set of cell states. It is subdivided in the following substates:</p>
<ul>
<li><p><span class="math inline">\(Q_z\)</span> is the set of values representing the topographic altitude (i.e. elevation);</p></li>
<li><p><span class="math inline">\(Q_h\)</span> is the set of values representing the debris thickness;</p></li>
<li><p><span class="math inline">\(Q_o^4\)</span> are the sets of values representing the debris outflows from the central cell to the neighboring ones.</p></li>
</ul>
<p>The Cartesian product of the substates defines the overall set of state <span class="math inline">\(Q\)</span>:</p>
<p><span class="math display">\[Q = Q_z \times Q_h \times Q_o^4\]</span> so that the cell state is specified by the following sixtuple:</p>
<p><span class="math display">\[q = (q_z, q_h, q_{o_0}, q_{o_1}, q_{o_2}, q_{o_3})\]</span> In particular, <span class="math inline">\(q_{o_0}\)</span> represents the outflows from the central cell towards the neighbour 1, <span class="math inline">\(q_{o_1}\)</span> the outflow towards the neighbour 2, and so on.</p></li>
<li><p><span class="math inline">\(P\)</span> is set of parameters ruling the CA dynamics:</p>
<ul>
<li><p><span class="math inline">\(p_\epsilon\)</span> is the parameter which specifies the thickness of the debris that cannot leave the cell due to the effect of adherence;</p></li>
<li><p><span class="math inline">\(p_r\)</span> is the relaxation rate parameter, which affects the size of outflows (cf. section above).</p></li>
</ul></li>
<li><p><span class="math inline">\(\sigma : Q^5 \shortrightarrow Q\)</span> is the deterministic cell transition function. It is composed by two elementary processes, listed below in the same order they are applied:</p>
<ul>
<li><p><span class="math inline">\(\sigma_1 : (Q_z \times Q_h)^5 \times p_\epsilon \times
p_r\shortrightarrow Q_o^4\)</span> determines the outflows from the central cell to the neighboring ones by applying the <em>minimization algorithm of the differences</em>. In brief, a preliminary control avoids outflows computation for those cells in which the amount of debris is smaller or equal to <span class="math inline">\(p_\epsilon\)</span>, acting as a simplification of the adherence effect. Thus, by means of the minimization algorithm, outflows <span class="math inline">\(q_o(0,m) \; (m=0,\ldots,3)\)</span> from the central cell towards its four adjecent cells are evaluated, and the <span class="math inline">\(Q_o^4\)</span> substates accordingly updated. Note that, <span class="math inline">\(q_o(0,0)\)</span> represents the aoutflow from the central cell towards the neighbour 1, <span class="math inline">\(q_o(0,1)\)</span> the aoutflow towards the neighbour 2, and so on. In general, <span class="math inline">\(q_o(0,m)\)</span> represnets the outflows from the central cell towards the <span class="math inline">\(n=(m+1)^{th}\)</span> neighbouring cell. Eventually, a relaxation rate factor, <span class="math inline">\(p_r \in \; ]0,1]\)</span>, is considered in order to obtain the local equilibrium condition in more than one CA step. This can significantly improve the realism of model as, in general, more than one step may be needed to displace the proper amount of debris from a cell towards the adjacent ones. In this case, if <span class="math inline">\(f(0,m) \; (i=0, \ldots, 3)\)</span> represent the outgoing flows towards the 4 adjacent cells, as computed by the minimization algorithm, the resulting outflows are given by <span class="math inline">\(q_o(0,m)=f(0,m) \cdot p_r \; (i=0, \ldots, 3)\)</span>.</p></li>
<li><p><span class="math inline">\(\sigma_2: Q_h \times (Q_o^4)^4 \shortrightarrow Q_h\)</span> determines the value of debris thickness inside the cell by considering mass exchange in the cell neighborhood: <span class="math inline">\(h'(0) = h(0) + \sum_{m=0}^3
(q_o(0,m) - q_o(m,0))\)</span>. Here, <span class="math inline">\(h'(0)\)</span> is the new debris thickness inside the cell, while <span class="math inline">\(q_o(m,0)\)</span> represents the inflow from the <span class="math inline">\(n=(m+1)^{th}\)</span> neighbouring cell. No parameters are involved in this elementary process.</p></li>
</ul></li>
</ul>
<p>In the following Listing [lst:cal_sciddicaT], an OpenCAL implementation of SciddicaT is shown.</p>
<p>As for the case of Game of Life, the CA model and the simulation objects are declared as global variables (lines 22 and 35, respectively), and defined later into the main function (lines 147 and 148, respevctively). As you can see, the 2D cellular space is a grid of <code>ROWS</code> rows times <code>COLS</code> columns cells, corresponding to <span class="math inline">\(i_{max}\)</span> and <span class="math inline">\(j_{max}\)</span> of the formal definition, respectively (cf. lines 10-11), while the von Neumann neighbourhood is adopted. The cellular space is still toroidal, as in Life, and no optimization is considered. Regarding the simulation object, a total of <code>STEPS</code> steps (i.e. 4000 steps - cf. line 14) are set, and implicit substates updating considered.</p>
<p>Substates and parameters are grouped into two different C structures (lines 24-28 and 30-33, respectively). Substates are therefore bound to the CA context by means of the <code>calAddSubstate2Dr()</code> function (lines 155-160), as well as elementary processes are defined as collback functions by means of the <code>calAddElementaryProcess2D()</code> function (lines 151-152).</p>
<p>The topographic altitude and debris thickness substates are initialized from files through the <code>calLoadSubstate2Dr()</code> function (lines 163-164), while the remaining initial state of the CA is set by means of the <code>calRunAddInitFunc2D()</code> function. It registers the <code>sciddicaT_simulation_init()</code> callback, whic is executed once before the execution of the simulation loop, in which the elementary processes are applied to the whole set of cells of the cellular space. Such a callback function must return void and take a pointer to a simulation obect as parameter. Differently to an elementary process, that can only access state values of cells belonging to the neighbourhood, this function can perform global operations on the whole cellular space. In the specific case of the SciddicaT model, the <code>sciddicaT_simulation_init()</code> function (lines 104-130) sets the values of all the outflows from the central cell to its neighbours to zero, by means of the function <code>calInitSubstate2Dr()</code> (lines 110-113). Moreover, it sets the values of the P.r and P.epsilon parameters (lines 116-117) and initializes the debris flow source by simply subtracting the source’s debris thickness to the topographic altitude. For this purpose, a nested double for is executed to check the debris thickness in each cell of the cellular space. Here, the <code>sciddicaT->rows</code> and <code>sciddicaT->cols</code> members of the CA object are used, which represent the cellular spece’s numbers of rows and columns, respectively. Still, the <code>calGet2Dr()</code> and <code>calSet2Dr()</code> functions are here employed to read/update substates’ values inside the cells.</p>
<p>Line 168 defines a <em>steering</em> callback by the <code>calRunAddSteeringFunc2D()</code> function. Steering is executed at the end of each computational step (i.e. after all the elementary processe have been applied to each cell of the cellular space), and can perform global operations over the cellular space. In this case, the <code>sciddicaT_simulation_init()</code> callback is registered; it must return void and takes a pointer to a simulation object as function parameter. It simply reset (to zero) the outflows everywere through the <code>calInitSubstate2Dr()</code> function.</p>
<p>The function <code>calRun2D()</code> (line 171) enters the OpenCAL simulation loop, which exectues a totoal of 4000 steps (cf. lines 14 and 148). Eventually, the final debris flow path is saved to file by means of the <code>calSaveSubstate2Dr()</code> function (line 176) and previously allocated memery is released (lines 179-180).</p>
<p>As regards the elementary processes, the first one, <span class="math inline">\(\sigma_1\)</span>, is defined at lines 38-88, while the second, <span class="math inline">\(\sigma_2\)</span>, at lines 91-101. In both cases, the <code>calGet2Dr()</code> <code>calGetX2Dr()</code> functions are employed to get substes’ values for the central cell and its neighbours, respectively. Moreover, the <code>calSet2Dr()</code> function, updates the central cell’s state.</p>
<p>Figure [fig:sciddicaT] shows the SciddicaT simulation of the 1992 Tessina (Italy) landslide. Both the initial landslide source and the final flow path configruation are shown.</p>
<div class="figure">
<img src="./images/OpenCAL/sciddicaT.png" alt="SciddicaT simulation of the 1992 Tessina (Italy) landslide. Topographic altitudes are represented in gray scale. Black represents the lower altitude, while the white color is used for the highest elevation in the study area. Debris thickness is represented with colours ranging from red (for lower values) to yellow (for higher values). (a) Initial configuration. (b) Final debris flow path. Note that the graphic output was generated by using the cal_sciddicaT-glut application, that implements the SciddicaT model and provides a minimal visualization system. You can find it in the examples directoy." width="453" />
<p class="caption">SciddicaT simulation of the 1992 Tessina (Italy) landslide. Topographic altitudes are represented in gray scale. Black represents the lower altitude, while the white color is used for the highest elevation in the study area. Debris thickness is represented with colours ranging from red (for lower values) to yellow (for higher values). (a) Initial configuration. (b) Final debris flow path. Note that the graphic output was generated by using the <code>cal_sciddicaT-glut</code> application, that implements the SciddicaT model and provides a minimal visualization system. You can find it in the examples directoy.<span data-label="fig:sciddicaT"></span></p>
</div>
<p>As regards computational preformace, the simulation shown in Figure [fig:sciddicaT] was executed on a Intel Core i7-4702HQ CPU @ 2.20GHz by exploiting only a single core. The simulation lasted a total of 172 seconds for executing a total of 4000 compuational steps.</p>
<p>Figure [fig:opencal_main_loop] shows the OpenCAL main loop. Before entering the loop, if defined, the init function is executed. Afterwards, while the current step is lower or equal to the final step of computation (or this latter is set to <code>CAL_RUN_LOOP</code>), elementary processes are executed concurrently<a href="#fn9" class="footnoteRef" id="fnref9"><sup>9</sup></a>. In this cycle, substates are updated after each elementary process has been applyed, while just before the end of the computational step, if defined, the steering function is executed. At the end of the computational step, a stop condition is checked, which can stop the simulation before the last step is reached. In order to define such a stop condition, the user can use the <code>stopCondition()</code> function, which registers a callback in which the stop condition can be defined.</p>
<div class="figure">
<img src="./images/OpenCAL/opencal_main_loop.png" alt="OpenCAL main loop chart." width="359" />
<p class="caption">OpenCAL main loop chart.<span data-label="fig:opencal_main_loop"></span></p>
</div>
<h2 id="sec:sciddicaT_active">SciddicaT with active cells optimization</h2>
<p>Here we present a computationally improved version of SciddicaT, which takes advantage of the built-in OpenCAL active cells optimization. As stated above, this optimization is able to restrict computation to a subset of cells which are actually involved in computation, by neglecting those cells for which is sure they will not change state to the next step (stationary cells).</p>
<p>In the case of SciddicaT, only cells containing debris and their neighbours can change state to the next step, as they can be interested in mass variation due to outflows and inflows. At the beginning of the simulation, we can simply initialize the set of active cells to those cells containing debris (i.e. those cells forming the initial landslide source). Moreover, we can add to this set new cells or remove some ones from it. Specifically, if an outflow is computed from an active cell towards a neighbouring non-active cell, this latter can be added to the set of active cells and considered for state change by the remaining elementary processes in the current step of computation<a href="#fn10" class="footnoteRef" id="fnref10"><sup>10</sup></a> (if any), or by the next computational step. Similarly, if a given active cell looses a sufficient amount of debris, it can be eliminated from the set of active cells. In the case of SciddicaT, this appens when its thickness becomes lower than or equal to a given threshold (i.e. <span class="math inline">\(p_\epsilon\)</span>).</p>
<p>In order to account for these processes, we have to slightly revise the SciddicaT definition. In particular we have to add the set of active cells, A. The optimized SciddicaT model is now defined as</p>
<p><span class="math display">\[SciddicaT = < R, A, X, Q , P, \sigma >\]</span> where <span class="math inline">\(A \subseteq R\)</span> is the set of active cells, while the other components are defned as before. The transition function is now defined as:</p>
<p><span class="math display">\[\sigma : A \times Q^5 \shortrightarrow Q \times A\]</span> denoting that it is applied to only the cells in <span class="math inline">\(A\)</span> and that it can add/remove active cells. More in detail, the <span class="math inline">\(\sigma_1\)</span> elementary process has to be modified, as it can activate new cell. Moreover, a new elementary process, <span class="math inline">\(\sigma_3\)</span>, has to be added in order to remove cells that cannot produce outflows during the next computational step due to the fact that their debris thickness is negligible. The new sequence of elementary processes is listed below, in the same order they are applied.</p>
<ul>
<li><p><span class="math inline">\(\sigma_1 : A \times (Q_z \times Q_h)^5 \times p_\epsilon \times p_r
\shortrightarrow Q_o^4 \times A\)</span> determines the outflows from the central cell to the neighboring ones, as before. In addition, each time an outflow is computed, the neighbour receiving the flow is added to the set of active cells.</p></li>
<li><p><span class="math inline">\(\sigma_2: A \times Q_h \times (Q_o^4)^4 \shortrightarrow Q_h\)</span> determines the value of debris thickness inside the cell by considering mass exchange in the cell neighborhood. This elementary process does not change with respect to the original version of SciddicaT.</p></li>
<li><p><span class="math inline">\(\sigma_3: A \times Q_h \times p_\epsilon \shortrightarrow A\)</span> removes a cell from <span class="math inline">\(A\)</span> if its debris thickness is lower than or equal to the <span class="math inline">\(p_\epsilon\)</span> threshold.</p></li>
</ul>
<p>In order to implement the SciddicaT debris flows model in OpenCAL by exploiting the active cells optimization, we have to chage the definition of the CA objet, by also adding the third <span class="math inline">\(\sigma_3\)</span> elementary process. Moreover, the <span class="math inline">\(\sigma_1\)</span> elementary process has to be changed. A complete implementation of the sactive cells optimized version of SciddicaT is shown in Listing [lst:cal_Sciddicat-activecells] for the sake of completeness, even if only the differences with respect to the original implementation are commented.</p>
<p>As you can easily see, few modifications to the original source code are needed to add the active cells optimization to SciddicaT. In particular, the active cells optimization is enabled by means of the <code>CAL_OPT_ACTIVE_CELLS</code> parameter at line 159, while the third elementary process added at line 165. As regards the elementary processe <span class="math inline">\(\sigma_1\)</span>, it is the same of the one of the basic SciddicaT version, with the exception that when an outflow is generated, the cell receiving the flow is added to the set A of the active cells (line 88). Moreover, an active cell is eliminated by the set A by means of the <span class="math inline">\(\sigma_3\)</span> elementary process in the case its debris thickness becomes lower than or equal to the <span class="math inline">\(p_\epsilon\)</span> threshold parameter (lines 107-108).</p>
<p>Regarding the computational preformace, the same simulation shown in Figure [fig:sciddicaT] was executed using the new SciddicaT implementation adopting the active cells implementation. Still, only a single core of the Intel Core i7-4702HQ CPU was used, as we did before. The simulation lasted a total of 22 seconds, versus 172 seconds obtained for the basic (non-optimized) version, which is about 8 times faster. This can be condidered a very good result you can easily obtain when your simulation involves only a limited subset of the computational domain.</p>
<h2 id="sec:sciddicaT_extended">SciddicaT as eXtended CA</h2>
<p>OpenCAL allows for further optimization of the SciddicaT debris flows simulation model by means of the so called <em>unsafe operations</em>. In fact, in some cases, it is possible to consider an extended definition of the computational model, allowing for operations that are not strictly permitted by the formal definition of Cellular Automata. In particular, we will allow the transition function to update the state of the neighbouring cells, while the CA only allows for state change for the central one. When we will permit such a kind of unsafe operations, we are in front of an XCA eXtended Cellular Automata. Obviously, in order to be well defined, the XCA must be equivalent to the original CA model.</p>
<p>An XCA equivalent version of SciddicaT can be obtained by obseving that, when an outflow is computed from the central cell towards a neighbour, the flow can be immediatly subtracted from the central cell and added to the neighbour. This does not change the state of the system at the current step, which is defined by the <em>current</em> computational plane, since updated values are written to the <em>next</em> computational plane. As a result, the <em>current</em> computational plane is not corrupted by the extended operation, and the <em>next</em> plane is used for progressively accounting mass variation inside the cells. By introducing such feature, ouflows don’t need to be saved (e.g. into additional substates) anymore, as they are used to account mass exchange directly during ouflows computation. As you can figure out, this can give rise to a further performace improvement for the application. The SciddicaT XCA model is formally defined as:</p>
<p><span class="math display">\[SciddicaT = < R, A, X, Q , P, \sigma >\]</span></p>
<p>where:</p>
<ul>
<li><p><span class="math inline">\(R\)</span> is the set of points, with integer coordinates, which defines the 2-dimensional cellular space over which the phenomenon evolves. The generic cell in <span class="math inline">\(R\)</span> is individuated by means of a couple of integer coordinates <span class="math inline">\((i, j)\)</span>, where <span class="math inline">\(0 \leq i < i_{max}\)</span> and <span class="math inline">\(0 \leq j < j_{max}\)</span>. The firt coordinate, <span class="math inline">\(i\)</span>, represents the row, while the second, <span class="math inline">\(j\)</span>, the column. The cell at coodinates <span class="math inline">\((0,0)\)</span> is located at the top-left corner of the computational grid.</p></li>
<li><p><span class="math inline">\(A \subseteq R\)</span> is the set of active cells, i.e. those cells actually involved in computation.</p></li>
<li><p><span class="math inline">\(X = \{(0,0), (-1, 0), (0, -1), (0, 1), (1, 0)\}\)</span> is the von Neumann neighborhood relation (cf. Figure [fig:2Dneighborhood]), a geometrical pattern which identifies the cells influencing the state transition of the central cell. The neighborhood of the generic cell of coordinate <span class="math inline">\((i, j)\)</span> is given by <span class="math display">\[N(X, (i, j)) =\]</span> <span class="math display">\[= \{(i, j)+(0,0), (i, j)+(-1, 0), (i, j)+(0, -1),
(i, j)+(0, 1), (i, j)+(1, 0)\} =\]</span> <span class="math display">\[= \{(i, j), (i-1, j), (i, j-1), (i, j+1), (i+1, j)\}\]</span></p>
<p>Here, a subscript operator can be used to index cells belonging to the neighbourhood. Let <span class="math inline">\(|X|\)</span> be the number of elements in X, and <span class="math inline">\(n \in
\mathbb{N}\)</span>, <span class="math inline">\(0 \leq n < |X|\)</span>; the notation</p>
<p><span class="math display">\[N(X, (i, j), n)\]</span></p>
<p>represents the <span class="math inline">\(n^{th}\)</span> neighbourhood of the cell <span class="math inline">\((i,j)\)</span>. Thereby, <span class="math inline">\(N(X, (i, j), 0) = (i, j)\)</span>, i.e. the central cell, <span class="math inline">\(N(X, (i, j), 1) =
(i-1, j)\)</span>, i.e. the first neighbour, and so on (cf. Figure [fig:LifeNeighborhood]).</p></li>
<li><p><span class="math inline">\(Q\)</span> is the set of cell states; it is subdivided in the following substates:</p>
<ul>
<li><p><span class="math inline">\(Q_z\)</span> is the set of values representing the topographic altitude (i.e. elevation);</p></li>
<li><p><span class="math inline">\(Q_h\)</span> is the set of values representing the debris thickness;</p></li>
</ul>
<p>The Cartesian product of the substates defines the overall set of state <span class="math inline">\(Q\)</span>:</p>
<p><span class="math display">\[Q = Q_z \times Q_h\]</span> so that the cell state is specified by:</p>
<p><span class="math display">\[q = (q_z, q_h)\]</span></p></li>
<li><p><span class="math inline">\(P\)</span> is set of parameters ruling the CA dynamics:</p>
<ul>
<li><p><span class="math inline">\(p_\epsilon\)</span> is the parameter which specifies the thickness of the debris that cannot leave the cell due to the effect of adherence;</p></li>
<li><p><span class="math inline">\(p_r\)</span> is the relaxation rate parameter, which affects the size of outflows (cf. section above).</p></li>
</ul></li>
<li><p><span class="math inline">\(\sigma : A \times Q^5 \shortrightarrow Q\)</span> is the deterministic cell transition function. It is composed by two elementary processes:</p>
<ul>
<li><p><span class="math inline">\(\sigma_1 : A \times (Q_z \times Q_h)^5 \times p_\epsilon \times
p_r\shortrightarrow (A \times Q_h)^5\)</span> determines the outflows from the central cell to the neighboring ones and updates debris thickness inside the central cell and its neighbours accordingly. It also adds the neighbouring cells receining a flow to the set A of the active cells.</p></li>
<li><p><span class="math inline">\(\sigma_2: A \times Q_h \times p_\epsilon \shortrightarrow A\)</span> removes the cell from the set A of the active cells if the debris thickness inside the cell is lower than or equal to the <span class="math inline">\(p_\epsilon\)</span> threshold.</p></li>
</ul></li>
</ul>
<p>Note that, only the topographic altitude and the debris thickness are now considered as model’s substates, as the four outflows substates are no longer needed. Moreover, the number of elementary process now considered is two, instead of three for the previous versions of SciddicaT. The OpenCAL implementation of the further optimized SciddicaT debris flows model is shown in Listing [lst:cal_sciddicaT-unsafe].</p>
<p>As you can see, the definitions of CA and simulation objects doesn’t change from the previous implementation (lines 131-132), while only two elementary processes are considered (lines 135-136). In particular, the firt call to <code>calAddElementaryProcess2D()</code> registers the callbak function implementing the <span class="math inline">\(\sigma_1\)</span> elementary process. It computes outflows from the (active) central cell to its neighbours (line 83) and updates the debris tickness in both the central cell and the neighbouring cell receiving a flow (lines 84-85). Moreover, neighbouring cells receiving a flow are added to the set A of active cells (line 88) and therefore will be considered for elaboration by the subsuequent elementary process (<span class="math inline">\(\sigma_2\)</span>) in the current step of computation<a href="#fn11" class="footnoteRef" id="fnref11"><sup>11</sup></a>. In particular, the <code>calSetX2Dr()</code> <em>unsafe</em> function is used to update the derbis thickess of the neighbouring cells receiving a flow, while the <code>calAddActiveCellX2D()</code> function is used to add a neighbouring cells receiving a flow to the set <span class="math inline">\(A\)</span> of active cells. The <span class="math inline">\(\sigma_2\)</span> elementary process, simply removes inactive cells from <span class="math inline">\(A\)</span> (lines 95-86), as in the previous example.</p>
<p>Substates are added to the CA at lines 139-140. Here, the first substate, <span class="math inline">\(Q_z\)</span>, is added by menas of the <code>calAddSingleLayerSubstate2Dr()</code> function. It is here considered to allocate memory only for the <em>current</em> computing plane. In fact, topographic altitutde only changes at the simulation initialization stage (cf. lines 147 and 117), while it remains inalterated during computation as its value is never updated by the transition function. This allows for memory space allocation optimization and possibly for computational performance improvements. Note that, at line 117 we used the <code>calSetCurrent2Dr()</code> function, instead of the usual <code>calSet2Dr()</code>. The <code>calSetCurrent2Dr()</code> function allows for updating the <em>current</em> computational plane (the only present in the <span class="math inline">\(Q_z\)</span> substate), while <code>calSet2Dr()</code> would update the <em>next</em> computational plane, by producing an access violation error.</p>
<p>Regarding the computational preformace, the same simulation shown in Figure [fig:sciddicaT] was executed by considering the XCA implementation of SciddicaT on a single core of the same processor. The simulation lasted a total of 11 seconds, versus 22 seconds obtained for the active cells optimized version and 172 seconds for the basic (non-optimized) version. The XCA implementation resulted 2 times faster than the active cells optimized version and about 16 times faster with than the basic one.</p>
<h2 id="sciddicat-with-explicit-simulation-loop">SciddicaT with explicit simulation loop</h2>
<p>Even if results obtained so far can be considered more than satisfying, it is further possibile to improve computational performance of SciddicaT by avoiding unnecessary substates updating. In fact, in some cases, elementary processes don’t affect one or more model’s substates and therefore their updating becomes only a waste of time.</p>
<p>As we stated above, when we use the implicit <code>calRun2D()</code> simulation loop, an update of all the defined substates is executed at the end of each elementary process. However, this behaviour can be modified by making the OpenCAL simulation loop explicit.</p>
<p>In the specific case of the SciddicaT XCA model, the second elementary process, <span class="math inline">\(\sigma_2\)</span>, just remove cells that became inactive from the set <span class="math inline">\(A\)</span> of active cells and doesn’t affect the mode’s substates<a href="#fn12" class="footnoteRef" id="fnref12"><sup>12</sup></a>. As a consequence, no substates updating is needed after the application of <span class="math inline">\(\sigma_2\)</span>. Being substates udating a time comsuming operation, this can further speed up your simulation.</p>
<p>A new OpenCAL implementation of SciddicaT is presented in Listing [lst:cal_sciddicaT-explicit]. It is based on an explicit simulation loop and also defines a stopping criterion for the simulation termination. The complete implementation is shown for the sake of completeness.</p>
<p>As you can see, the <code>calRunAddGlobalTransitionFunc2D()</code> function is called to register a custom transition function callback (line 177). In the specific case of SciddicaT, the <code>sciddicaTransitionFunction()</code> callback (lines 132-143) is used to make the elementary processes application and the substates update explicit. Here, the elementary processes are applied in the same order they are defined by means of the <code>calAddElementaryProcess2D()</code> function (which is the default behaviour of OpenCAL), even if you are free to re-order the call sequence within the explicit transition function callback. In particular, the <code>sciddicaT_flows()</code> elementary process is applied to each (active) cell into the computational domain by means of the <code>calApplyElementaryProcess2D()</code> function. Then, the set <span class="math inline">\(A\)</span> of the active cells and the <span class="math inline">\(Q_h\)</span> substate are updated by means of the <code>calUpdateActiveCells2D()</code> and <code>calUpdateSubstate2Dr()</code>, respectively<a href="#fn13" class="footnoteRef" id="fnref13"><sup>13</sup></a>. Eventually, the <code>sciddicaT_remove_inactive_cells()</code> elementary process is applied, which only removes cells that became incative during the current computational step, and the set <span class="math inline">\(A\)</span> is accordingly updated.</p>
<p>As regard the computational performance, this further optimized version lasted 12 seconds to complete the 4000 steps required by the simulation on a single core of the same Intel Core i7 processor used before. The obtained sped up is here quite small (less than the 10%) with respect to the previous implementation of SciddicaT. However, SciddicaT is a very simplyfied model and higer speed up can certainly be obtained for more complex CA made up by more elementary processes, theese latter involving only a small set of model’s substates. Table [tab:speedup] resumes computational performace of all the above illustraed SciddicaT implementations.</p>
<table>
<caption>Computational performace of the four different implementations of the SciddicaT debris flows model.<span data-label="tab:speedup"></span></caption>
<thead>
<tr class="header">
<th align="left">CA model</th>
<th align="center">Elapsed time [s]</th>
<th align="center">Speedup</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">SciddicaT</td>
<td align="center">240</td>
<td align="center">1</td>
</tr>
<tr class="even">
<td align="left">SciddicaT with active cells</td>
<td align="center">23</td>
<td align="center">10.43</td>
</tr>
<tr class="odd">
<td align="left">SciddicaT XCA (eXtended CA)</td>
<td align="center">13</td>
<td align="center">18.46</td>
</tr>
<tr class="even">
<td align="left">SciddicaT XCA explicit update</td>
<td align="center">12</td>
<td align="center">20</td>
</tr>
</tbody>
</table>
<h2 id="sec:mod2">A three-dimensional example</h2>
<p>In order to introduce you to development with of three-dimensional CA with OpenCAL, we start this section by implementing a simple 3D model, namely the <em>mod2</em> 3D CA. Cells can be in one of two differnt states, 0 or 1, as in Game of Life. The cellular space is a parallelepiped made by cubic cells, while the cell’s neighbourhood is the 3D Moore one, consisting of the central cell and its adjecent cells. The transition function simply evaluates the quantity <span class="math inline">\(s\)</span> as the number of neighbouring cells which are in the state 1 and sets the new state for the central cell as <span class="math inline">\(s\%2\)</span> (i.e. the rest of <span class="math inline">\(s\)</span> divided by 2). This simple example of 3D CA is formally defined as:</p>
<p><span class="math display">\[mod2 = < R, X, Q, \sigma >\]</span></p>
<p>where:</p>
<ul>
<li><p><span class="math inline">\(R\)</span> is the set of points, with integer coordinates, which defines the 3-dimensional cellular space. The generic cell in <span class="math inline">\(R\)</span> is individuated by means of a triple of integer coordinates <span class="math inline">\((i, j,
k)\)</span>, where <span class="math inline">\(0 \leq i < i_{max}\)</span>, <span class="math inline">\(0 \leq j < j_{max}\)</span>, and <span class="math inline">\(0 \leq k
< k_{max}\)</span>. The firt coordinate, <span class="math inline">\(i\)</span>, represents the row, the second, <span class="math inline">\(j\)</span>, the column, while the third coordinate represents the slice. The cell at coodinates <span class="math inline">\((0,0,0)\)</span> is located at the top-left-far corner of the computational grid.</p></li>
<li><p><span class="math inline">\(X = \{(0,0,0), \dots, (-1,1,0), (0,0,-1), \dots, (-1,1,-1),
(0,0,1), \dots, (-1,1,1)\}\)</span> is the Moore neighborhood relation, a geometrical pattern which identifies the cells influencing the state transition of the central cell. The neighborhood of the generic cell of coordinate <span class="math inline">\((i, j)\)</span> is given by <span class="math display">\[N(X, (i, j, k)) =\]</span> <span class="math display">\[= \{(i, j, k)+(0,0,0), \dots, (i, j, k)+(-1,1,-1)\} =\]</span> <span class="math display">\[= \{(i, j, k), \dots, (i-1,j+1,k-1)\}\]</span> Here, a subscript operator can be used to index cells belonging to the neighbourhood. Let <span class="math inline">\(|X|\)</span> be the number of elements in X, and <span class="math inline">\(n \in
\mathbb{N}\)</span>, <span class="math inline">\(0 \leq n < |X|\)</span>; the notation</p>
<p><span class="math display">\[N(X, (i, j, k), n)\]</span></p>
<p>represents the <span class="math inline">\(n^{th}\)</span> neighbourhood of the cell <span class="math inline">\((i,j,k)\)</span>. Thereby, <span class="math inline">\(N(X, (i, j, k), 0) = (i, j, k)\)</span>, i.e. the central cell, <span class="math inline">\(N(X, (i, j, k), 1)
= (i-1, j, k)\)</span>, i.e. the first neighbour, and so on.</p></li>
<li><p><span class="math inline">\(Q = \{0, 1\}\)</span> is the set of cell states.</p></li>
<li><p><span class="math inline">\(\sigma : Q^{27} \shortrightarrow Q\)</span> is the deterministic cell transition function. It is composed by one elementary process, which implements the previously descripted transition rules.</p></li>
</ul>
<p>As you can imagine, the OpenCAL implementation of the <em>mod2</em> 3D CA is very simple, as the Conway’s game of Life is. The complete source code is shown in Listing [lst:cal_mod2CA3D].</p>
<p>As you can see, even if Listing [lst:cal_life] is very short, it completely defines the <em>mod2</em> 3D CA and perform a simulation (actually, only one step in this example). Lines 3-5 include some header files for the 3D version of OpenCAL, while lines 12-14 declare CA, substate and simulation objects. They are therefore defined later in the main function. In particular, line 30 defines the CA as a parallelepiped having <code>ROWS</code> rows, <code>COLS</code> columns and <code>SLICES</code> slices (cf. lines 7-9). Moreover, the 3D Moore neighbourhood is here set as well as cyclic conditions at boundaries. Eventually, no optimizations are considered. Line 31 defines the simulation object by setting just one step of computation and implicit substate’s update. Finally, the only substate, <span class="math inline">\(Q\)</span>, is defined at line 34. Note that, since it was defined by means of the <code>calInitSubstate3Db()</code> function, each element <span class="math inline">\(q \in Q\)</span> results to be of the CALbyte type. Line 37 registers the only CA’s elementary process, namely the <code>mod2_transition_function()</code> function, which is then implemnented at lines 17-25. Line 43 initializes the cell’s substated <span class="math inline">\(Q\)</span> at coordinates (2, 3, 1) to the state 1. The obtained initial configuration is then saved to disk at line 46, and the simulation ran at line 49. The final configuration is therefore saved to disk at line 52 and, eventually, memory previously and implicitly allocated is released at lines 55-56. Note that, diffrently to the previous examples, almost all the OpenCAL functions come in the 3D flower. For instace, this is the case of the <code>alGetX3Db()</code> and <code>calSet3Db()</code> functions at lines 22 and 24, respectively, which take <code>k</code> as third cell’s coordinate, identifying the cellular space’s slice.</p>
<p>Figures [fig:mod2_0000] and [fig:mod2_LAST] show the initial and final configuration of <em>mod2</em> 3D CA as implemented in Listing [lst:cal_mod2CA3D], respectively. A graphical representation after 77 computational step is shown in Figure [fig:cal_mod2CA3D].</p>
<div class="figure">
<img src="./images/OpenCAL/mod2_0000.png" alt="Initial configuration of mod2 3D CA, as implemented in Listing [lst:cal_mod2CA3D]." width="132" />
<p class="caption">Initial configuration of mod2 3D CA, as implemented in Listing [lst:cal_mod2CA3D].<span data-label="fig:mod2_0000"></span></p>
</div>
<div class="figure">
<img src="./images/OpenCAL/mod2_LAST.png" alt="Final configuration of mod2 3D CA (actually, just one step of computation), as implemented in Listing [lst:cal_mod2CA3D]." width="132" />
<p class="caption">Final configuration of mod2 3D CA (actually, just one step of computation), as implemented in Listing [lst:cal_mod2CA3D].<span data-label="fig:mod2_LAST"></span></p>
</div>
<div class="figure">
<img src="./images/OpenCAL/mod23DCA-glut.png" alt="Graphical representation of the mod2 3D CA after 77 computational steps, as implemented in Listing [lst:cal_mod2CA3D]. CA dimensions were set to (rows, cols, slices) = (65, 65, 65), while the initial seed located at coordinates (12, 12, 12). Cells in black are in the state 0, cell in white are in the state 1." width="453" />
<p class="caption">Graphical representation of the mod2 3D CA after 77 computational steps, as implemented in Listing [lst:cal_mod2CA3D]. CA dimensions were set to (rows, cols, slices) = (65, 65, 65), while the initial seed located at coordinates (12, 12, 12). Cells in black are in the state 0, cell in white are in the state 1.<span data-label="fig:cal_mod2CA3D"></span></p>
</div>
<h2 id="combining-opencal-and-opencal-gl">Combining OpenCAL and OpenCAL-GL</h2>
<p>In this section, we will show how you can combine the OpenCAL and OpenCAL-GL libraries to have an immediate graphic output of your simulation. At this purpose, here we re-propose two of the examples presented above, namely the SciddicaT debris flow model and the <em>mod2</em> CA, by adding them an OpenGL-based viewer in a straightforward manner. The visualization system provided by OpenCAL-GL is however rather simplyfied and therefore, if you need a more sophisticated graphic system you have to develop an alternative by yourself.</p>
<h3 id="implementing-sciddicat-in-opencal-and-opencal-gl">Implementing SciddicaT in OpenCAL and OpenCAL-GL</h3>
<p>A new implementation of SciddicaT is presented in Listing [lst:calgl_sciddicaT], which integrates a simple multi-view 2D and 3D visualization system based on OpenCAL-GL. The complete implementation is shown for the sake of completeness in Listing [lst:calgl_sciddicaT]. A screenshot is shown in Figure [fig:calgl_sciddicaT1].</p>
<div class="figure">
<img src="./images/OpenCAL/calgl_sciddicaT1.png" alt="Screenshot of the SciddicaT debris flow model with a multi-view 2D and 3D visualization system based on OpenCAL-GL." width="453" />
<p class="caption">Screenshot of the SciddicaT debris flow model with a multi-view 2D and 3D visualization system based on OpenCAL-GL.<span data-label="fig:calgl_sciddicaT1"></span></p>
</div>
<p>As you can see, two visualization objects of type <code>CALDrawModel2D</code> are declared into the <code>main()</code> function. Each of them allows to define a particular view of the CA inside a graphic windows.</p>
<p>The <code>calglDefDrawModel2D()</code> function is used to initialize graphic objcts (it acts as a constructor). The first parameter specifies the type of drawing to be considered and can assume the values listed in Table [tab:draw_modes]. The second parameter represents the title to label the portion of the graphic window containing the view. The third parameter is a pointer to the CA object to be visualized, while the last one is a pounter to the corresponding simulation object.</p>
<p>To build a specifig view for the CA, we use the <code>calglAddToDrawModel2Dr()</code> function. It adds a node into a tree representation, where each node is linked to a CA substate and specifies how this substate have to be considered for representation. In particular, a susbstate can be used as vertex data, normal data and color data.</p>
<table>
<caption>Possible OpenCAL-GL drawing types.<span data-label="tab:draw_modes"></span></caption>
<thead>
<tr class="header">
<th align="left">Drawing type</th>
<th align="left">Meaning</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left"><code>CALGL_DRAW_MODE_FLAT</code></td>
<td align="left">Each cell is interpreted as an element of a matrix</td>
</tr>
<tr class="even">
<td align="left"><code>CALGL_DRAW_MODE_SURFACE</code></td>
<td align="left">Each cell is assumed to belong to a topographic surface</td>
</tr>
</tbody>
</table>
<h3 id="implementing-the-mod2-ca-in-opencal-and-opencal-gl">Implementing the <em>mod2</em> CA in OpenCAL and OpenCAL-GL</h3>
<h2 id="opencal-global-functions">OpenCAL Global Functions</h2>
<p>OpenCAL comes with some functions you can use to perform global operation over the cellular space. They essentially are reduction function. Note that such functions should not used within elementary processes, but only in the initialization, steering and stop condition functions. The proper use of global functions is your own responsibility.</p>
<p>In ordert to use global functions in your application, simply include the XXXXXX. Table YYYYYY lists the OpenCALì’s global functions.</p>
<h1 id="ch:opencal-omp">OpenCAL OpenMP version</h1>
<p>Nowadays, parallel computing is the most effective solution to overcome temporal limits of sequential computation. With the name OpenCAL-OMP, we identify the OpenMP implementation of the software library, that can run on all cores for your CPU. If you are lucky and have a shared memory multiptocessor system, OpenCAL-OMP can also exploit all the cores of all your CPUs.</p>
<p>Similarly to the serial version, OpenCAL-OMP allows for some <em>unsafe operations</em>, which can significantly speed up your application. However, when you use OpenCAL-OMP in <em>unsafe mode</em> you must give the utmost attention to avoid <em>race condition</em> issues. For instance, when many threads perform concurrent operations on the same memory locations and such operations are made by more than one atomic machine instruction, it can happen that they can interleave, giving rise to wrong (i.e., non consistent) results. Furthermore, even in the case of atomic operations, the logic order of execution could not be respected. Thus, for instance, a read-write logic sequence of atomic operations can actually become a write-read (wrong) sequence due to the fact that the thread performing the write operation is executed first.</p>
<p>In the following sections we will introduce OpenCAL-OMP by examples, highlighting source code the differences with respect to the serial implementations shown in Chapter [ch:opencal]. In the first part of the Chapter, we will deal with the OpenCAL’s <em>safe mode</em>, while in the last one, we will discuss unsafe operations.</p>
<h2 id="conways-game-of-life-in-opencal-omp">Conway’s Game of Life in OpenCAL-OMP</h2>
<p>In Section [sec:cal_life], we described Conway’s Game of Life and shown a possible implementation using the <code>OpenCAL</code> serial library. Here, we present a <code>OpenCAL-OMP</code> implementation of the same cellular automaton (Listing [lst:calomp_life]), by discussing the differences with respect its serial implementation (Listing [lst:cal_life]).</p>
<p>As you can see, the OpenMP-based implementation of Life, which uses only safe operations, is almost identical to the serial one thanks to the seamless parallelization adopted by the library. The only differences can be found at lines 3-5 where, instead of including the OpenCAL header files, you can find the OpenCAL-OMP headers. All the remaining source code is unchanged. Note that also the OpenCAL-OMP statements’ prefix does not change with respect to the OpenCAL’s one (i.e. <code>cal</code> for the functions, <code>CAL</code> for the data types, and <code>CAL_</code> for the constants). In practice, if you only use OpenCAL-OMP in safe mode, besides including the proper OpenCAL-OMP header files instead of the OpenCAL ones, minimal changes are required to the serial code.</p>
<h2 id="sciddicat">SciddicaT</h2>
<p>As for the case of Conway’s Game of Life, even the OpenCAL-OMP implementation of the SciddicaT cellular automaton, shown in Lsting [lst:calomp_sciddicaT], does not significantly differ from the serial implementation that you can find in the Section [sec:sciddicaT], Listing [lst:cal_sciddicaT]. As before, the only differences regard the included headers (lines 3-5). Even in this case, as for the Life cellular automaton, due to the fact we used only OpenCAL-OMP safe operations, mimimal code change is required, besides including the proper OpenCAL-OMP header files instead of the OpenCAL ones.</p>
<h2 id="sciddicat-with-active-cells-optimization">SciddicaT with active cells optimization</h2>
<p>Here we present an OpenCAL-OMP implemenation of SciddicaT, which takes advantage of the built-in OpenCAL active cells optimization feature. You can find the complete source code in Listing [lst:calomp_Sciddicat-activecells], while the corresponding serial implementation can be found in Section [sec:sciddicaT_active], Listing [lst:cal_Sciddicat-activecells].</p>
<p>With respect to the Sciddica implementation shown in Listing [lst:calomp_sciddicaT], which is exclusively based on safe OpenCAL-OMP operations, the active cells management as implemented here requires an unsafe operations. Such unsafe operations are performed by means of the <code>calAddActiveCellX2D()</code> function (line 87), which adds a cell belonging to the neighbourhood to the set <span class="math inline">\(A\)</span> of active cells. As evident, such an operation is considered unsafe because it can give rise to race condition. In fact, if more threads try to add the same cell to the set <span class="math inline">\(A\)</span> at the same time, being this a non-atomic operation, threads’ operations can interleave and the outcome be wrong. In order to avoid this possible error, OpenCAL-OMP is able to <em>lock</em> the memory locations involved in the operations so that each thread can entirely perform its own task without the risk that other threads interfere. In order to do this, it is sufficient to place OpenCAL-OMP in <em>unsafe</em> state by calling the <code>calSetUnsafe2D()</code>, as done at line 163. No other modifications to the serial source code are required.</p>
<h2 id="sciddicat-as-extended-ca">SciddicaT as eXtended CA</h2>
<p>Here we present an OpenCAL-OMP implementation of SciddicaT, which takes advantage of the built-in unsafe operations. They belong to the eXtended CA definition and allow for further computational optimizations. You can find the complete source code of SciddicaT implemented as an eXtended CA in Listing [lst:calomp_SciddicaT-unsafe], while the corresponding serial implementation can be found in Section [sec:sciddicaT_extended], Listing [lst:cal_sciddicaT-unsafe].</p>
<p>First of all - from a XCA modeling point of view - note that only the topographic altitude and the debris thickness are now considered as model’s substates (lines 25-28, 147-148), as the four outflows substates are no longer needed. Moreover, the number of elementary process now considered is two (lines 143-144), instead of three for the previous versions of SciddicaT.</p>
<p>The call to the <code>calSetUnsafe2D()</code> function (line 139) places OpenCAL-OMP in unsafe mode, allowing to lock memory locations (i.e. cells) that can be simultaneously accessed by more threads. In order properly exploit the OpenCAL-OMP’s built in lock feature, you have to use specific functions, which are provided by the <code>OpenCAL-OMP/cal2DUnsafe.h</code> header file (line 6). In the specific case, besides the already discussed <code>calAddActiveCellX2D()</code> function, the <code>calAddNext2Dr()</code> and <code>calAddNextX2Dr()</code> functions are employed (lines 88-89), in place of the combination of get-set operations, as done in the corresponding serial implementation (Listing [lst:cal_sciddicaT-unsafe], lines 84-85). In fact, consider the source code snippet in Listing [lst:get-set] (checked out by Listing [lst:cal_sciddicaT-unsafe]). As you can see, for each not-eliminated cell, the algorithm computes a flow, <span class="math inline">\(f\)</span> (line 5) and then subtracts it from the central cell (line 6), adding it to the corresponding neighbour (line 7), in order to accomplish mass balance. In both cases (flow subtraction and adding), a flavor of <code>calGet</code> function is called to read the current value of the <span class="math inline">\(Q_h\)</span> substate from the next working plane. Subsequently, a flavor of the <code>calSet</code> function is used to update the previously read value. When a single thread is used to perform such operations, no race conditions can obviously occur. At the contrary, even in the case of two concurrent threads, different undesirable situations can take place, which give rise to a race condition and therefore to a wrong result. For instance, let’s suppose both threads read the value first, and then write their updated values; in this case, the resulting value will correspond to the one written by the thread that writes the value for last, and the contribution of the other thread will be lost.</p>
<pre><code> // <snip>
for (n=1; n<sciddicaT->sizeof_X; n++)
if (!eliminated_cells[n])
{
f = (average-u[n])*P.r;
calSet2Dr (sciddicaT,Q.h,i,j, calGetNext2Dr (sciddicaT,Q.h,i,j) -f);
calSetX2Dr(sciddicaT,Q.h,i,j,n,calGetNextX2Dr(sciddicaT,Q.h,i,j,n)+f);
// <snip>
}
// <snip></code></pre>
<p>In order to avoid such kind of problems when dealing with more threads, the above mentioned <code>calAddNext2Dr()</code> and <code>calAddNextX2Dr()</code> functions lock the cell under consideration and then perform the get-set operations without the risk other threads can interfere. In this way, no race conditions can be triggered. Obviously, there is a side-effect in terms of computational performance. In fact, as expected, locks can slow down threads execution and therefore the entire simulation.</p>
<h2 id="sciddicat-with-explicit-simulation-loop-1">SciddicaT with explicit simulation loop</h2>
<p>As for the serial version, also for the OpenMP based release of OpenCAL it is further possible to improve computational performance of SciddicaT by avoiding unnecessary substates updating.</p>
<p>As already reported, the <code>calRun2D()</code> function used so far to run the simulation loop updates all the defined substates at the end of each elementary process. However, in the specific case of the SciddicaT XCA model, no substates updating should be executed after the application of the second elementary process, as this just removes inactive cells from the set <span class="math inline">\(A\)</span>.</p>
<p>A new OpenCAL implementation of SciddicaT is presented in Listing [lst:calomp_sciddicaT-explicit]. It is based on an explicit global transition function, defined by means of <code>calRunAddGlobalTransitionFunc2D()</code>. It registers a callback function within which you can both reorder the sequence of elementary processes to be applied in the generic computational step, and also select which substates have to be updated after the execution of the different elementary processes. The SciddicaT implementation here presented in Listing [lst:calomp_sciddicaT-explicit] also makes explicit the simulation loop and defines a stopping criterion for the simulation termination.</p>
<h2 id="sciddicat-computational-performance">SciddicaT computational performance</h2>
<p>Table [tab:speedup] resumes computational performance of all the above illustrated SciddicaT implementations as implemented in OpenCAL-OMP. The considered case of study is the simulation of the Tessina landslide shown in Figure [fig:sciddicaT], which required a total of 4000 computational steps. The adopted CPU is a Intel Core i7-4702HQ @ 2.20GHz 4 cores (8 threads) processor, already considered for the performance evaluation of the corresponding serial SciddicaT implementations described in Chapter [ch:opencal]. Results are provided both in terms of elapsed time and speed up with respect to the corresponding serial version. Elapsed times of the serial simulations are also reported.</p>
<table>
<caption>Speedup of the four different implementations of the SciddicaS3hex debris flows model accelerated by OpenMP.<span data-label="tab:speedup"></span></caption>
<thead>
<tr class="header">
<th align="left">T version</th>
<th align="center">Serial [s]</th>
<th align="center">1thr</th>
<th align="center">2thr</th>
<th align="center">4thr</th>
<th align="center">6thr</th>
<th align="center">8thr</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">naive</td>
<td align="center">240s</td>
<td align="center">0.82 (293s)</td>
<td align="center">1.22 (196s)</td>
<td align="center">1.53 (157s)</td>
<td align="center">1.64 (146s)</td>
<td align="center">1.6 (150s)</td>
</tr>
<tr class="even">
<td align="left">active cells</td>
<td align="center">23s</td>
<td align="center">0.77 (30s)</td>
<td align="center">1.36 (17s)</td>
<td align="center">1.77 (13s)</td>
<td align="center">2.09 (11s)</td>
<td align="center">2.3 (10s)</td>
</tr>
<tr class="odd">
<td align="left">eXtended CA</td>
<td align="center">13s</td>
<td align="center">0.77 (17s)</td>
<td align="center">1.86 (7s)</td>
<td align="center">2.6 (5s)</td>
<td align="center">2.17 (6s)</td>
<td align="center">2.6 (5s)</td>
</tr>
<tr class="even">
<td align="left">explicit loop</td>
<td align="center">12s</td>
<td align="center">0.75 (16s)</td>
<td align="center">1.2 (10s)</td>
<td align="center">2.4 (5s)</td>
<td align="center">2.4 (5s)</td>
<td align="center">3.0 (4s)</td>
</tr>
</tbody>
</table>
<p>As you can see, results are quite good. In particular, the better results in terms of speed up were obtained for the fully optimized SciddicaT implementation (with the explicit substate updating feature), which runs 3 time faster than the corresponding serial version when executed over 8 threads. Nevertheless, consider that the SciddicaT simulation model here adopted is quite simple and better performance in terms of speed up can certainly be obtained for CA models with more complex transition functions and extended computational domains.</p>
<p>Eventually, please note how progressive optimizations can considerably reduce the overall execution time. In fact, if for the naive (i.e., non optimized at all) serial implementation the elapsed time was 240s, for the fully optimized parallel version the simulation lasted only 3 seconds, corresponding to a speed up value of 80, i.e. the fully optimized parallel version runs 80 times faster than the serial naive implementation.</p>
<h2 id="a-three-dimensional-example">A three-dimensional example</h2>
<p>In Section [sec:mod2], we described the <em>mod2</em> 3D CA and shown a possible implementation using the <code>OpenCAL</code> serial library. Here, we briefly present a <code>OpenCAL-OMP</code> implementation of the same cellular automaton (Listing [lst:calomp_life]), by discussing the differences with respect the corresponding serial implementation (Listing [lst:cal_life]).</p>
<p>As you can see, the OpenMP-based implementation, which uses only safe operations, is almost identical to the serial one. As for the case of the Game of Life CA, the only differences can be found at lines 3-5 where, instead of including the OpenCAL header files, we included the OpenCAL-OMP headers. All the remaining source code is unchanged.</p>
<h1 id="ch:opencal-cl">OpenCAL OpenCL version</h1>
<p>This chapter introduces OpenCAL-CL, a porting of OpenCAL in OpenCL. OpenCL is a parallel framework originally proposed by Apple and then released as an Open Standard under the Khronos Group management. Besides computational efficiency, one of the main advantages of OpenCL is portability. In fact, you can run your program wherever you want across heterogeneous processors like Central Processing Units (CPUs), Graphics Processing Units (GPUs), Digital Signal Processors (DSPs), and Field-Programmable Gate Arrays (FPGAs).</p>
<p>OpenCAL-CL inherits many OpenCAL’s features, by also adding parallel computation capability thanks to the adoption of OpenCL. The application is now subdivided in two parts: the <em>host program</em>, running on the CPU, and the <em>device program</em>, running on a compliant computational device (e.g. an Nvidia or AMD GPU). The CA object is still defined host-side, as in OpenCAL, while elementary processes (and possibly other functions) are defined device-side. Belonging to the device program, CA elementary processes must be defined as OpenCL’s <em>kernels</em> and therefore the programmer has to be able to write some minimal OpenCL code to implement them. Fortunately, OpenCAL-CL hides lots of parallel aspects to the user (e.g. the simulation loop is internally managed by the library) and also simplifies data exchange between host and device. The user’s OpenCL parallel programming background can be therefore limited or even null. In the latter case, the user can learn some basic elements of OpenCL kernel programming thanks to this guide.</p>
<p>This chapter is divided into two parts: the first part is a very brief overview of OpenCL, while the second one introduces OpenCAL-CL by examples.</p>
<h2 id="sec:openclstructure">OpenCL framework</h2>
<p>OpenCL enables parallel programming that assigns computational tasks to multiple processing elements to be performed at the same time. These tasks are called <em>kernels</em>. A kernel is a special function that can be executed to different OpenCL-compliant devices. In the CA modeling context, each elementary process of the CA transition function is devised in the OpenCAL library as a kernel. The kernels are sent to the device(s) by the host application. The host application defines a structure called <em>context</em>, needed to manage data exchange and computation on the compliant devices.</p>
<p>In particular, host application links kernels into one or more programs. In the host application, the user can select the kernel’s functions to insert inside a container called <em>program</em>. The program connects the kernel with argument data and dispatches it to a structure called <em>command queue</em>. The command queue is a structure that allows the host to decide what the devices have to do and, when a kernel is enqueued, the device will execute the relative function.</p>
<div class="figure">
<img src="./images/OpenCAL-CL/kernelDistribution.png" alt="General structure of a OpenCL program" width="453" />
<p class="caption">General structure of a OpenCL program<span data-label="fig:GeneralStructure"></span></p>
</div>
<p>As we can see from the picture above, the context contains all the devices, all the command queues and all the kernels. More in detail, for every device a command queue is associated and every command queue has inside all the kernel functions that the device has to execute.</p>
<h2 id="the-structure-of-opencal-cl">The structure of OpenCAL-CL</h2>
<p>As in OpenCL applications, the OpenCAL-CL library has a set of structures and functions to develop host program and to define kernels. The definition of the CA is specified in the host program. The user can define two types of models, 2D or 3D, as shown in [ch:opencal]. The host program manages the kernels and dispatches the data of the mode (e.g., CA substates, the type of neighbourhood, size of the cellular space, etc) to the computation units. The host program manages the execution loop and which computation unit has to execute the transition functions.<br />
The host program is typically divided in the following sections:</p>
<ul>
<li><p>definition of the model (Chapter [ch:opencal])</p></li>
<li><p>management of the OpenCL devices</p></li>
<li><p>kernels allocation</p></li>
<li><p>data dispatch from the model to devices</p></li>
<li><p>execution loop start</p></li>
</ul>
<h2 id="host-programming">Host programming</h2>
<h3 id="definition-of-the-model">Definition of the model</h3>
<p>XXX</p>
<h3 id="manage-of-the-devices">Manage of the devices</h3>
<p>After the model creation, the user must choose the device for the kernel computation. Inside the library, a structure called CALOpenCL allows the user to manage all available platforms and devices. This structure simplifies the access to the devices compared with the native API of OpenCL. The library supplies other functions to know which platforms and devices are available on the system and to have information about these.<br />
Below you can see a simple program that explains how the CALOpenCL structure can be used.</p>
<pre><code>#include <calCL2D .h>
int main (){
CALOpenCL * calOpenCL = calclCreateCALOpenCL (); // get all available
platforms calclInitializePlatforms ( calOpenCL ); // get all
available devices calclInitializeDevices ( calOpenCL );
// get the first device on the first platform CALCLdevice device =
calclGetDevice ( calOpenCL , 0, 0);
// create a context CALCLcontext context = calclcreateContext (&
device , 1);
return 0; }</code></pre>
<p>Inside the library, the platforms and the devices are stored in a matrix where rows represent the platforms and columns represent the devices. Thus, to choose which one we can use for the computation, it is necessary to specify the index of platform and the index of the device. For example, at lines 12, we chose the platform number 0 and the device number 0. If we have a system with 3 NVIDIA GPUs and 3 AMD GPUs, the library will have a <span class="math inline">\(2 \times 3\)</span> size matrix, where 2 are the vendors (i.e., the platforms NVIDIA and AMD) and 3 are the GPUs for each platforms. If we want to run the program using the third AMD GPU, we can specify 1 and 2 as indices. If we don’t know how the system identifies the platforms and devices, the library supplies us a function called <code>calclGetPlatformsAndDeviceFromStandardInput</code> that allows us to know the available platforms and devices. First it prints the information on standard output and then we can insert the indexes directly from standard input.<br />
After the device is chosen, the user must specify the path where the kernels and relative headers are. Through the function <code>calclLoadProgramLib(2D/3D)</code> the library reads automatically the kernels and compiles them.</p>
<pre><code>CALCLprogram calclLoadProgramLib(2D|3D) ( CALCLcontext context ,
CALCLdevice device , char * path_user_kernel , char *
path_user_include )</code></pre>
<h3 id="allocation-of-kernels">Allocation of kernels</h3>
<p>The library doesn’t know which kernels are related to the CA elementary processes, nor their execution order on the available devices. Thus, to create and allocate a kernel it’s necessary to call the function <code>calclGetKernelFromProgram</code> that retrieves an OpenCL kernel given a compiled OpenCL program.</p>
<pre><code>cl_kernel elementaryProcess = calclGetKernelFromProgram(&program,
KERNEL_NAME);</code></pre>
<h3 id="send-data-from-the-model-to-devices">Send data from the model to devices</h3>
<p>To transfer the data from host side to kernel side, the user must define the structure called <code>CALCLToolkit(2D/3D)</code> containing all the buffers (data) of the model. By default, the library sends to all the kernels the data associated to the model. The following list shows the data which is sent:</p>
<ul>
<li><p>the dimension of cellular space</p></li>
<li><p>the number of substate for every type (i.e., byte, int, real)</p></li>
<li><p>the substates allocated from the user</p></li>
<li><p>the list of active cells</p></li>
<li><p>the list of active cells flags</p></li>
<li><p>type, dimension and ID of the neighbourhood</p></li>
<li><p>the border condition</p></li>
</ul>
<p>First, the user must create an instance of the structure <code>CALCLToolkit(2D/3D)</code> by calling the function <code>calclCreateToolkit</code> as shown in the following code snippet.</p>
<pre><code>CALCLToolkit2D * calclCreateToolkit (2D|3D)(struct CALModel (2D|3D)
*model ,CALCLcontext context ,CALCLprogram program ,CALCLdevice device
,CALCLOptimization opt)</code></pre>
<p>The enumerative <code>CALCLOptimization</code> allows the user to choose if she/he wants to use the library without optimization <code>(CALCL_NO_OPT)</code> or with active cells optimization <code>(CALCL_OPT_ACTIVE_CELLS)</code>. The structure <code>CALCLToolkit(2D/3D)</code> doesn’t contain only the buffers to transfer data but also the kernels belonging to the execution loop.<br />
To add a new kernel to the execution loop, the user has to call the function <code>calclAddElementaryProcessKernel(2D/3D)</code> that adds the chosen kernel to the list of CA elementary processes; moreover, it sends to the device all the necessary data to execute the kernel.</p>
<pre><code>void calclAddElementaryProcessKernel2D(CALCLToolkit2D * toolkit2d,
struct CALModel2D *model, CALCLkernel * elemProcKernel);</code></pre>
<h3 id="start-execution-loop">Start execution loop</h3>
<p>To start the execution loop, the user has to call the function <code>calclRun(2D/3D)</code>. This function executes all the elementary processes previously declared on the specified device.</p>
<pre><code>void calclRun2D(CALCLToolkit2D* toolkit2d, struct CALModel2D * model,
unsigned int initialStep, unsigned maxStep);</code></pre>
<h2 id="kernel-programming">Kernel programming</h2>
<p>In order to program kernels in OpenCAL, the user needs to include some header files. Specifically, cal2D.h or cal3D.h permit to use some simple functions to interact with the data structures belonging to the model. To create a kernel function in OpenCL, user must place the keyword <code>__kernel</code> before the returning the type of the function. In OpenCAL-CL every time a kernel function, the keyword <code>MODEL_DEFINITION2D</code> must be specified as first parameter and the function <code>initThreads2D()</code> called as first instruction.<br />
The code below shows how to declare a new kernel.</p>
<pre><code>__kernel void kernel(MODEL_DEFINITION2D){ initThreads2D(); ... ...
... }</code></pre>
<p>When the user implements an elementary process - by defining its kernel function - she/he can rely on a set of OpenCAL functions that allow to get the substates values of both the central and the neighbouring cells, and to update the substates values of the central cell. Every time the user wants to use this function, the keyword <code>MODEL2D</code> must be passed as first parameter. For instance, if we want to retrieve the value of a specific cell we need to use the function <code>calGet2Dr</code>.</p>
<pre><code> double a = calGet2Dr(MODEL2D, 0, i, j);</code></pre>
<p>The function returns the value of the cell (i, j) of the substate 0.<br />
Since inside OpenCAL-CL one can use all OpenCL features, all memory levels such as global memory, local memory, private memory (XXXcitazione libroXXX) can be exploited. In order to use these memory levels, variables must be declared using this specific syntax respectively <code>__global</code>, <code>__local</code>, <code>__private</code>.</p>
<h2 id="conways-game-of-life-with-opencal-cl">Conway’s Game of Life with OpenCAL-CL</h2>
<p>As already reported in [sec:cal_life], to introduce how to develop a Cellular Automata model with OpenCAL-CL we can start by implementing Conway’s Game of Life and specifically its host application part. In order to use OpenCAL-CL, some header files are included (lines 3-8) and, in particular, the OpenCAL library and the OpenCAL-CL library. The OpenCAL library inclusion inside the host application is necessary to define the CA object (line 46) and the related substates (line 49). The cal2DIO.h header file (line 4) provides some basic input/output functions for reading/writing substates from/to file. To run the application with OpenCAL-CL, all the necessary objects have to be declared (line 27-31): specifically, the <code>CALOpenCL</code> structure (line 27), the <code>CALCLcontext</code>, the <code>CALCLdevice</code>, the <code>CALCLprogram</code> and the <code>CALCLToolkit2D</code>. To create and initialize these variables we need to call the function <code>calclCreateCALOpenCL</code> that creates the CALOpenCL structure, the functions <code>calclInitializePlatforms</code> and <code>calclInitializeDevices</code> that initialize all the platforms and devices on the available machine and the <code>calclGetDevice()</code> function, which will return the device where elementary process are computed by choosing the <code>platformNum</code> and <code>deviceNum</code> integers(line 24-25). Moreover, two important functions are here considered: <code>calclcreateContext</code> and <code>calclLoadProgramLib2D</code>. The first returns the context [sec:openclstructure], while the second builds all kernels inside the path specified by <code>kernelSrc</code>(line 15) including the files specified by <code>kernelInc</code>(line 16), and returns a program containing all the compiled kernels.</p>
<p>The definition and declaration of the model (line 46) has the same code that we discussed in [sec:cal_life], and includes all the information to create the model, to create and initialize substates (line 49-52) and to set a initial configuration (line 55-59).</p>
<p>Although the OpenCAL-CL library uses the same functions of the OpenCAL library to create the model and the substates, the execution of the elementary processes is quite different. As known, our elementary processes are implemented on GPU side through kernels, so we need to manage the transfer of memory between host and device sides and to decide which and when run the kernels. In a classical OpenCL application, it’s not trivial to handle all these issues. For this reason, in OpenCAL-CL we decided to introduce an intermediate object (<code>CALCLToolkit</code>) that hides to the user the memory management and handles the compiled chosen kernels. The user must only initialize the <code>CALCLToolkit</code> structure(line 62) and choose which kernel he/she wants to execute by the function <code>calclGetKernelFromProgram</code> specifying the program and kernel names. Once kernels are chosen to be executed, the user must only call the function <code>calclAddElementaryProcessKernel2D</code> (line 71) that adds the specific kernel elementary process inside the <code>CALCLToolkit</code> structure. Finally, to run the simulation the user must call the function <code>calclRun2D</code> specifying the initial and final step.</p>
<p>Below is reported the device side code of Conway’s Game of Life. In this application, we have only one elementary process, defined as a kernel called <code>life_transition_function</code>.<br />
The OpenCAL-CL library will launch exactly a number of threads equal to the entire cellular space, structured like a matrix. In this way, every cell can perform in parallel its own computation. To access to the indexes of the cell the user must call the function <code>getRow</code> and <code>getCol</code> line(14-15). Furthermore, the user can use the function <code>get_columns</code> and <code>get_rows</code> (line 17-18) to retrieve the dimensions of the cellular space. In the specific case of the Game of Life, we used the <code>calGet2Di</code> function to get the central cell’s value of the sub- state Q (remember that the central cell is identified by the coordinates (i, j)), the <code>calGetX2Di</code> function to retrieve the value of the n-th neighbour’s substate Q, and the <code>calSet2Di</code> function to update the value of the substate Q for the central cell.</p>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Indeed, even 1D CA can be defined oas a degenerate case of 2D, or even 3D, CA.<a href="#fnref1">↩</a></p></li>
<li id="fn2"><p>The clang C compiler can also be used, taking in mind that it still does not fully support Open-MP natively.<a href="#fnref2">↩</a></p></li>
<li id="fn3"><p>For a list of OpenMP compliant compilers see the following link: <a href="http://openmp.org/wp/openmp-compilers/" class="uri">http://openmp.org/wp/openmp-compilers/</a>.<a href="#fnref3">↩</a></p></li>
<li id="fn4"><p>OpenCAL-CL was tested on the NVIDIA OpenCL implementation. You can find the NVIDIA’s OpenCL implementation, shipped with the CUDA platform, at the following url: <a href="http://docs.nvidia.com/cuda/#installation-guides" class="uri">http://docs.nvidia.com/cuda/#installation-guides</a><a href="#fnref4">↩</a></p></li>
<li id="fn5"><p>For example <code>freeglut-devel</code> or <code>freeglut3-dev</code> packages on <code>yum/dnf</code>- and <code>apt</code>-based systems, respectively.<a href="#fnref5">↩</a></p></li>
<li id="fn6"><p>XCA are also known as Complex Cellular Automata (CCA), Macroscopic Cellular Automata (MCA), and Multicomponent Cellular Automata (MCA)<a href="#fnref6">↩</a></p></li>
<li id="fn7"><p>Note that, in the case of the Game of Life CA, the central cell does not belong to its own neighborhood.<a href="#fnref7">↩</a></p></li>
<li id="fn8"><p>This characteristic is colled <em>implicit parallelism</em> and is obtained in OpenCAL by considering two different working planes for each registered substate, namely <em>current</em> and <em>next</em> working planes. However, behind the scene, cells are updated sequentially, being OpenCAL a serial software. The <em>current</em> working plane is used to read current values for the cells, while the <em>next</em> working plane is used to write updated values. The <em>implicit parallelism</em> is also used in the parallel versions of OpenCAL, with the difference that more than one cell can be processed and updated concurrently by exploiting mre than one processing unit.<a href="#fnref8">↩</a></p></li>
<li id="fn9"><p>On the serial version of OpenCAL, implicit parallelism is obtained by exploiting the two different computing planes built into OpenCAL’s substates. The first one, that we will call <em>current</em>, is used to read substates’s values for the central cell and its neighbours, while the second, that we will call <em>next</em>, is used to update the new state for the central cell. When all the cells have been processed and the new state values updated, computing planes are switched, i.e. the <em>next</em> plane is assumed as <em>current</em> and the <em>current</em> as <em>next</em>, and the process is reiterated. In this manner, the <em>current</em> computing plane is not <em>corrupted</em> during a computational step, being new values written to the <em>next</em> plane. Note that, even in the case more processing units are used to compute the next CA state and more cells are updated simltaneously, which is the case of OpenCAL-OMP and OpenCAL-CL, the two computing planes are manteined.<a href="#fnref9">↩</a></p></li>
<li id="fn10"><p>Remember that, by default, substates are updated after the application of each alementary process.<a href="#fnref10">↩</a></p></li>
<li id="fn11"><p>This is due to the fact that a substates’ update is performed after the first elementary process has been applied to all the (active) cells of the cellular space. This behaviour is set by menas of the <code>CAL_UPDATE_IMPLICIT</code> parameter used in the definition of the simulation object at line 132 of Listing [lst:cal_sciddicaT-unsafe].<a href="#fnref11">↩</a></p></li>
<li id="fn12"><p>Actually, only <span class="math inline">\(Q.h\)</span> can be update by the transition function, since <span class="math inline">\(Q.z\)</span> is a single-lyered substate.<a href="#fnref12">↩</a></p></li>
<li id="fn13"><p>Note that active cells are updated first otherwise the subsequent substate update could neglect some cells that have become active during the current step. For instance, inactive cells can receive a flow and become active at he current step of computation. If the set of active cells is not updated before any other substates, those new cells will still be considered inactive during the current step and their value will not be updated, by losing debris flow mass.<a href="#fnref13">↩</a></p></li>
</ol>
</div>
</body>
</html>