<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6029867844615683735</id><updated>2012-02-14T00:16:33.637+08:00</updated><category term='parallel processing'/><category term='math'/><category term='parallel programming'/><category term='code'/><title type='text'>right side of wrong</title><subtitle type='html'>CRight CHeart::rightSideOfWrong(CWrong aWrong)</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://l411v.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>6</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6029867844615683735.post-1370754469363114919</id><published>2008-05-22T18:53:00.009+08:00</published><updated>2010-05-21T00:18:51.662+08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='code'/><category scheme='http://www.blogger.com/atom/ns#' term='math'/><title 
type='text'>algorithm</title><content type='html'>some quotes from the &lt;a href="http://ocw.mit.edu/OcwWeb/Electrical-Engineering-and-Computer-Science/6-046JFall-2005/CourseHome/index.htm"&gt;algorithm class at mit&lt;/a&gt;.&lt;br /&gt;&lt;blockquote&gt;you wanna be a good programmer?&lt;br /&gt;you just program every day for two years, you will be an excellent programmer.&lt;br /&gt;&lt;br /&gt;you wanna be a world class programmer?&lt;br /&gt;you can program every day for ten years or,&lt;br /&gt;you can program every day for two years and take an algorithm class.&lt;br /&gt;&lt;br /&gt;-- charles e. leiserson&lt;/blockquote&gt;&lt;br /&gt;&lt;blockquote&gt;we have both balance our mathematical understanding with our engineering common sense.&lt;br /&gt;&lt;br /&gt;-- charles e. leiserson&lt;/blockquote&gt;&lt;br /&gt;it's not an ordinary &lt;span style="font-style: italic;"&gt;algorithm class&lt;/span&gt;, at least for me! i did not even learn about &lt;a href="http://en.wikibooks.org/wiki/Data_Structures/Asymptotic_Notation"&gt;&lt;span&gt;asymptotic notation&lt;/span&gt;&lt;/a&gt; when i took my &lt;span style="font-style: italic;"&gt;algorithm and programming&lt;/span&gt; class.</content><link rel='replies' type='application/atom+xml' href='http://l411v.blogspot.com/feeds/1370754469363114919/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6029867844615683735&amp;postID=1370754469363114919' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/1370754469363114919'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6029867844615683735/posts/default/1370754469363114919'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/2008/05/algorithm.html' title='algorithm'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6029867844615683735.post-5024916095885023493</id><published>2008-04-26T14:31:00.015+08:00</published><updated>2008-05-06T22:43:06.675+08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='code'/><category scheme='http://www.blogger.com/atom/ns#' term='math'/><title type='text'>octave: speed optimization - fix matrix looping (080426)</title><content type='html'>the &lt;a href="http://www.octave.org/"&gt;octave&lt;/a&gt; snippets below all fill the same matrix, but their execution times differ markedly. preallocating memory, changing the loop order, vectorizing, and using a dedicated function each give a different execution time. 
there are eight cases to compare:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 1&lt;/span&gt;: without preallocated memory&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m1(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 39.3089 seconds.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 2&lt;/span&gt;: preallocated memory with 'm = [];'&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m2 = [];&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 
0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m2(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 38.7618 seconds.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 3&lt;/span&gt;: preallocated memory with 'm(N, P) = 1;'&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m3(N, P) = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m3(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: 
rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 0.126078 second.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 4&lt;/span&gt;: preallocated memory with 'm = ones(N, P);'&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m4 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m4(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 
0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 0.082219 second.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 5&lt;/span&gt;: changing the looping sequence from 'for i = 1:N' to 'for i = N:-1:1'&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;thanks to &lt;span style="font-style: italic;"&gt;ben abbott&lt;/span&gt; for &lt;a href="http://www.cae.wisc.edu/pipermail/help-octave/2008-May/009137.html"&gt;his answer&lt;/a&gt; to my question related to this optimization. he also suggested that i compare with the following code:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m5 = [];&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = N:-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m5(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 0.130076 second.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;thanks to &lt;span style="font-style: 
italic;"&gt;przemek klosowski&lt;/span&gt; for the following new cases (cases 6 to 8) and the idea of adding a conclusion.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 6&lt;/span&gt;: nested loops with preallocated memory&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m6 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = N:-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = P:-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;        m6(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 
255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 15.0053 seconds.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 7&lt;/span&gt;: vectorizing using looping&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m7 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m7(:,&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;) = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;/pre&gt;elapsed time is 0.053149 second.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span&gt;case 8&lt;/span&gt;: vectorizing using 'repmat' function&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m8 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;repmat&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;((&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N)', &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 
51);"&gt;, P);&lt;/span&gt;&lt;/pre&gt;elapsed time is 0.045707 second.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;conclusion&lt;/li&gt;&lt;/ul&gt;&lt;ol&gt;&lt;li&gt;from cases 1-4: one way to speed up a loop is to preallocate the matrix. note that 'm = [];' (case 2) creates an empty matrix and reserves nothing, so it is no faster than case 1.&lt;/li&gt;&lt;li&gt;from case 5: another way to speed up a loop is to reverse the loop order, so that the first assignment (to row N) allocates the whole matrix at once.&lt;/li&gt;&lt;li&gt;from case 6: element-by-element looping is very time consuming, even with preallocated memory. so, avoid loops when other commands can do the job. the cost can differ from case to case.&lt;/li&gt;&lt;li&gt;from case 7: another way to speed up a loop is vectorizing. vectorizing can be faster than preallocating memory or reversing the loop order.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;from case 8: another way to speed up the code is to use a dedicated function such as repmat. a dedicated function can be faster than vectorizing, preallocating memory, or reversing the loop order.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;ul&gt;&lt;li&gt;overall test code&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# Copyright (C) 2008 by lain.ux&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# l411v.ux@gmail.com&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# www.l411v.com&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;#&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# This program is free software; you can redistribute it and/or modify &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# it under the terms of the GNU General Public License as published by &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# the Free Software Foundation; version 2 of the License.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;#&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# 
This program is distributed in the hope that it will be useful, &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# but WITHOUT ANY WARRANTY; without even the implied warranty of &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# GNU General Public License for more details.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;#&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# You should have received a copy of the GNU General Public License &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# along with this program; if not, write to the &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# Free Software Foundation, Inc., &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------[ define matrix size ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;N = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;900&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;P = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1800&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"matrix size: m(%d,%d)\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# -------------------------------------------[ display result 
function ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;function&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; display_result(m, N, P)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"result:\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# checking first row (expected: 1, 1, 1, ..., 1)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[1,1] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[1,2] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[1,3] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;span style="color: 
rgb(255, 0, 0);"&gt;3&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[1,%d] = %d\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,P));&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# checking second row (expected: 2, 2, 2, ..., 2)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[2,1] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[2,2] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[2,3] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 
51);"&gt;,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;3&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[2,%d] = %d\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P, m(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,P));&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# checking Nth row (expected: 900, 900, 900, ..., 900)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[%d,1] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, N, m(N,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[%d,2] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, N, m(N,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"m[%d,3] = %d\t"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, N, m(N,&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;3&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span 
style="color: rgb(255, 0, 255);"&gt;"m[%d,%d] = %d\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, N, P, m(N,P));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# each column of matrix is the sequence 1, 2, ..., N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 1 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 1: without preallocated memory\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m1(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span 
style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;display_result(m1, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 2 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 2: preallocated memory with 'm = [];'\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m2 = [];&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m2(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 
51);"&gt;display_result(m2, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 3 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 3: preallocated memory with 'm(N, P) = 1;'\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m3(N, P) = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m3(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: 
rgb(0, 255, 51);"&gt;display_result(m3, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 4 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 4: preallocated memory with 'm = ones(N, P);'\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m4 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m4(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br 
/&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;display_result(m4, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 5 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 5: changing the looping sequence "&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"from 'for i = 1:N' to 'for i = N:-1:1'\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m5 = [];&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = N:-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m5(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; * &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (&lt;/span&gt;&lt;span style="color: rgb(255, 0, 
0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;display_result(m5, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 6 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 6: nested or more looping with preallocated memory\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m6 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = N:-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = P:-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;br 
/&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;        m6(&lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;) = &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;display_result(m6, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 7 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 7: vectorizing using looping\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m7 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ones&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:P&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    m7(:,&lt;/span&gt;&lt;span style="color: 
rgb(178, 140, 0);"&gt;j&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;) = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endfor&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;display_result(m7, N, P);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(173, 173, 173);"&gt;# ------------------------------------------------------------[ case 8 ]--&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;printf&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"\ncase 8: vectorizing using 'repmat' function\n"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;tic&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;m8 = &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;repmat&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;((&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:N)', &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;toc&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;display_result(m8, N, P);&lt;/span&gt;&lt;/pre&gt;&lt;ul&gt;&lt;li&gt;execution result&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ octave-3.0.1 -q trick_080426.octave&lt;br /&gt;matrix size: m(900,1800)&lt;br /&gt;&lt;br /&gt;case 1: without preallocated memory&lt;br /&gt;Elapsed time is 39.3089 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      
m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;br /&gt;case 2: preallocated memory with 'm = [];'&lt;br /&gt;Elapsed time is 38.7618 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;br /&gt;case 3: preallocated memory with 'm(N, P) = 1;'&lt;br /&gt;Elapsed time is 0.126078 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;br /&gt;case 4: preallocated memory with 'm = ones(N, P);'&lt;br /&gt;Elapsed time is 0.082219 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;br /&gt;case 5: changing the looping sequence from 'for i = 1:N' to 'for i = N:-1:1'&lt;br /&gt;Elapsed time is 0.130076 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;br /&gt;case 6: nested or more looping with preallocated memory&lt;br /&gt;Elapsed time is 15.0053 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 
900&lt;br /&gt;&lt;br /&gt;case 7: vectorizing using looping&lt;br /&gt;Elapsed time is 0.053149 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;br /&gt;case 8: vectorizing using 'repmat' function&lt;br /&gt;Elapsed time is 0.045707 seconds.&lt;br /&gt;result:&lt;br /&gt;m[1,1] = 1      m[1,2] = 1      m[1,3] = 1      m[1,1800] = 1&lt;br /&gt;m[2,1] = 2      m[2,2] = 2      m[2,3] = 2      m[2,1800] = 2&lt;br /&gt;m[900,1] = 900  m[900,2] = 900  m[900,3] = 900  m[900,1800] = 900&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6029867844615683735-5024916095885023493?l=l411v.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://l411v.blogspot.com/feeds/5024916095885023493/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6029867844615683735&amp;postID=5024916095885023493' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/5024916095885023493'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/5024916095885023493'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/2008/04/octave-speed-optimization-080426.html' title='octave: speed optimization - fix matrix looping (080426)'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6029867844615683735.post-3399308274730526667</id><published>2008-04-18T09:52:00.007+08:00</published><updated>2008-12-10T18:42:14.486+08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='parallel programming'/><title type='text'>openmp c/c++ on intel and linux using sun compiler</title><content type='html'>i wrote an earlier &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; post titled &lt;a href="http://l411v.blogspot.com/2008/04/openmp-on-intel-and-linux.html"&gt;openmp c/c++ on intel and linux using intel compiler&lt;/a&gt;. this post shows how to do the same thing using the &lt;span style="font-style: italic;"&gt;sun&lt;/span&gt; compiler. the steps are as follows:&lt;br /&gt;&lt;ul style="font-weight: bold;"&gt;&lt;li&gt;setting up &lt;span style="font-style: italic;"&gt;sunstudio&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-style: italic;"&gt;sunstudio&lt;/span&gt; is a development package from &lt;span style="font-style: italic;"&gt;sun microsystems&lt;/span&gt;. it contains the &lt;span style="font-style: italic;"&gt;netbeans ide&lt;/span&gt; and the &lt;span style="font-style: italic;"&gt;sun c compiler&lt;/span&gt;. we will need both of them to create the project and compile the code. my other post titled &lt;a href="http://l411v.blogspot.com/2008/04/sun-studio-on-linux.html"&gt;sun studio on linux&lt;/a&gt; describes how to set up &lt;span style="font-style: italic;"&gt;sunstudio&lt;/span&gt; on &lt;span style="font-style: italic;"&gt;debian gnu/linux&lt;/span&gt;. 
at the end of that post, i give two links to the &lt;span style="font-style: italic;"&gt;sunstudio&lt;/span&gt; documentation on how to use it.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;getting source code&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;i use the same source code that i used with the &lt;span style="font-style: italic;"&gt;intel&lt;/span&gt; compiler. please find the source code in my other post titled &lt;a href="http://l411v.blogspot.com/2008/04/openmp-on-intel-and-linux.html"&gt;openmp c/c++ on intel and linux using intel compiler&lt;/a&gt;, under the &lt;span style="font-style: italic;"&gt;sample code&lt;/span&gt; bullet.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;compiling source code&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;after setting up the project and loading the source code, follow these steps to activate the &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature during compilation.&lt;br /&gt;&lt;ul&gt;&lt;ul&gt;&lt;li&gt;open the &lt;span style="font-style: italic;"&gt;project properties&lt;/span&gt; window: in the &lt;span style="font-style: italic;"&gt;project&lt;/span&gt; window, right click on the project name, and select &lt;span style="font-style: italic;"&gt;properties&lt;/span&gt;.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;activate the &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature: in the &lt;span style="font-style: italic;"&gt;categories&lt;/span&gt; tree box, select &lt;span style="font-style: italic;"&gt;configuration properties&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;c/c++/fortran&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;c compiler&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;general&lt;/span&gt;. 
in the &lt;span style="font-style: italic;"&gt;properties&lt;/span&gt; table box, change &lt;span style="font-style: italic;"&gt;multithreading level&lt;/span&gt; from &lt;span style="font-style: italic;"&gt;none&lt;/span&gt; to &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt;. click on the &lt;span style="font-style: italic;"&gt;ok&lt;/span&gt; button.&lt;/li&gt;&lt;li&gt;build the project: on the menu bar, click on &lt;span style="font-style: italic;"&gt;build&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;build main project&lt;/span&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_e7_RlKSyGls/SAgtdijIc6I/AAAAAAAAAKI/cHaXsU_5xn4/s1600-h/sun+-+openmp+setting.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://1.bp.blogspot.com/_e7_RlKSyGls/SAgtdijIc6I/AAAAAAAAAKI/cHaXsU_5xn4/s320/sun+-+openmp+setting.png" alt="" id="BLOGGER_PHOTO_ID_5190448556058112930" border="0" /&gt;&lt;/a&gt;&lt;span style="font-style: italic;"&gt;activate openmp compilation feature&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;ul style="font-weight: bold;"&gt;&lt;li&gt;executing the binary file&lt;/li&gt;&lt;/ul&gt;you need to set the maximum stack size with the following command. i still don't know how to run this command from the &lt;span style="font-style: italic;"&gt;sunstudio ide&lt;/span&gt; before executing the binary file, so i execute the binary file on the console.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ ulimit -s unlimited&lt;/pre&gt;&lt;br /&gt;you also need to set the number of threads to use. i use an &lt;span style="font-style: italic;"&gt;intel core 2 duo&lt;/span&gt; processor. 
it is a dual-core processor, so i set it to 2.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ export OMP_NUM_THREADS=2&lt;br /&gt;$ export OMP_DYNAMIC=FALSE&lt;/pre&gt;&lt;br /&gt;i name the binary file built with the &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature &lt;span style="font-family:courier new;"&gt;sun_openmp&lt;/span&gt;. following is the execution result.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ ./sun_openmp&lt;br /&gt;Using time() for wall clock time&lt;br /&gt;Problem size: c(600,2400) = a(600,1200) * b(1200,2400)&lt;br /&gt;Calculating product 5 time(s)&lt;br /&gt;&lt;br /&gt;We are using 2 thread(s)&lt;br /&gt;&lt;br /&gt;Finished calculations.&lt;br /&gt;Matmul kernel wall clock time = 6.00 sec&lt;br /&gt;Wall clock time/thread = 3.00 sec&lt;br /&gt;MFlops = 2880.000000&lt;/pre&gt;&lt;br /&gt;i name the binary file built without the &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature &lt;span style="font-family:courier new;"&gt;sun_not_openmp&lt;/span&gt;. 
following is the execution result.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ ./sun_not_openmp&lt;br /&gt;Using time() for wall clock time&lt;br /&gt;Problem size: c(600,2400) = a(600,1200) * b(1200,2400)&lt;br /&gt;Calculating product 5 time(s)&lt;br /&gt;&lt;br /&gt;We are using 1 thread(s)&lt;br /&gt;&lt;br /&gt;Finished calculations.&lt;br /&gt;Matmul kernel wall clock time = 17.00 sec&lt;br /&gt;Wall clock time/thread = 17.00 sec&lt;br /&gt;MFlops = 1016.470588&lt;/pre&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;system monitor&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;the following &lt;span style="font-style: italic;"&gt;kde system guard (performance monitor)&lt;/span&gt; capture shows the &lt;span style="font-style: italic;"&gt;cpu load&lt;/span&gt; of &lt;span style="font-style: italic;"&gt;cpu0&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;cpu1&lt;/span&gt;. the two purple numbers mark:&lt;br /&gt;(1) single processor working: only &lt;span style="font-style: italic;"&gt;cpu0&lt;/span&gt; is 100% active, and the process finishes more slowly&lt;br /&gt;(2) dual processors working: both &lt;span style="font-style: italic;"&gt;cpu0&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;cpu1&lt;/span&gt; are 100% active, and the process finishes faster&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_e7_RlKSyGls/SAgvhijIc7I/AAAAAAAAAKQ/gSCDIVqoOp8/s1600-h/sun+-+system+load+-+edited.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://1.bp.blogspot.com/_e7_RlKSyGls/SAgvhijIc7I/AAAAAAAAAKQ/gSCDIVqoOp8/s320/sun+-+system+load+-+edited.png" alt="" id="BLOGGER_PHOTO_ID_5190450823800845234" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;openmp 
api user's guide&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;for a deeper understanding, please refer to &lt;a href="http://docs.sun.com/app/docs/doc/819-5270/aewbx?a=browse"&gt;sun studio 12: openmp api user's guide&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6029867844615683735-3399308274730526667?l=l411v.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://l411v.blogspot.com/feeds/3399308274730526667/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6029867844615683735&amp;postID=3399308274730526667' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/3399308274730526667'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/3399308274730526667'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/2008/04/openmp-cc-on-intel-and-linux-using-sun.html' title='openmp c/c++ on intel and linux using sun compiler'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_e7_RlKSyGls/SAgtdijIc6I/AAAAAAAAAKI/cHaXsU_5xn4/s72-c/sun+-+openmp+setting.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6029867844615683735.post-2150644835188122843</id><published>2008-04-02T23:59:00.026+08:00</published><updated>2008-12-10T18:42:15.371+08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='parallel 
programming'/><category scheme='http://www.blogger.com/atom/ns#' term='parallel processing'/><title type='text'>openmp c/c++ on intel and linux using intel compiler</title><content type='html'>i'm using an &lt;a href="http://www.intel.com/products/processor/core2duo/index.htm"&gt;intel core 2 duo&lt;/a&gt; processor and &lt;a href="http://www.debian.org/"&gt;debian gnu/linux&lt;/a&gt; 4.0. i'm currently learning &lt;a href="http://www.openmp.org/"&gt;openmp&lt;/a&gt;, and i want to know how much faster dual processors run compared to a single processor.&lt;br /&gt;&lt;ul style="font-weight: bold;"&gt;&lt;li&gt;hardware and software specification&lt;/li&gt;&lt;/ul&gt;the following processor information is taken from the &lt;span style="font-family:courier new;"&gt;/proc/cpuinfo&lt;/span&gt; file.&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;vendor_id       : GenuineIntel&lt;br /&gt;cpu family      : 6&lt;br /&gt;model           : 15&lt;br /&gt;model name      : Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz&lt;br /&gt;stepping        : 6&lt;br /&gt;cpu MHz         : 2400.778&lt;br /&gt;cache size      : 4096 KB&lt;br /&gt;physical id     : 0&lt;br /&gt;siblings        : 2&lt;br /&gt;cpu cores       : 2&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;i'm using &lt;a href="http://www.debian.org/"&gt;debian gnu/linux&lt;/a&gt; 4.0.&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ uname -a&lt;br /&gt;Linux l411v 2.6.18-6-686 #1 SMP Sun Feb 10 22:11:31 UTC 2008 i686 GNU/Linux&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;i installed the &lt;a href="http://www.intel.com/cd/software/products/asmo-na/eng/compilers/index.htm"&gt;intel c++ compiler professional edition&lt;/a&gt; for &lt;span style="font-style: italic;"&gt;linux&lt;/span&gt; for non-commercial use.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ icc --version&lt;br 
/&gt;icc (ICC) 10.1 20080112&lt;br /&gt;Copyright (C) 1985-2007 Intel Corporation.  All rights reserved.&lt;br /&gt;&lt;/pre&gt;&lt;ul style="font-weight: bold;"&gt;&lt;li&gt;setting environment&lt;/li&gt;&lt;/ul&gt;add the intel c++ compiler binary directory to the executable path.&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ export PATH=$PATH:/usr/share/intel/cc/10.1.012/bin/&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;set the library path to the intel c++ dynamic library directory.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ export LD_LIBRARY_PATH=/usr/share/intel/cc/10.1.012/lib&lt;br /&gt;$ echo $LD_LIBRARY_PATH&lt;br /&gt;/usr/share/intel/cc/10.1.012/lib&lt;br /&gt;&lt;/pre&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;fixing bug&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;i took the &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; sample code (&lt;span style="font-family:courier new;"&gt;openmp_sample.c&lt;/span&gt;) from the sample directory after installing the &lt;span style="font-style: italic;"&gt;intel&lt;/span&gt; c++ compiler. please find the sample code below. to activate the &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature, add the &lt;span style="font-family:courier new;"&gt;-openmp&lt;/span&gt; compiling parameter. i got some warnings when compiling the sample code without the &lt;span style="font-family:courier new;"&gt;-openmp&lt;/span&gt; compiling parameter. 
the warnings are as follows:&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ icc -std=c99 openmp_sample.c&lt;br /&gt;openmp_sample.c(106): warning #161: unrecognized #pragma&lt;br /&gt;     #pragma omp parallel private(i,j,k)&lt;br /&gt;             ^&lt;br /&gt;&lt;br /&gt;openmp_sample.c(109): warning #161: unrecognized #pragma&lt;br /&gt;      #pragma omp single nowait&lt;br /&gt;              ^&lt;br /&gt;&lt;br /&gt;openmp_sample.c(119): warning #161: unrecognized #pragma&lt;br /&gt;      #pragma omp for nowait&lt;br /&gt;              ^&lt;br /&gt;&lt;br /&gt;openmp_sample.c(126): warning #161: unrecognized #pragma&lt;br /&gt;      #pragma omp for nowait&lt;br /&gt;              ^&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;all i need to do is add a preprocessor guard around each &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; pragma in the sample code, so that the pragmas are skipped when &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; is not enabled. so change from:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;  #pragma omp parallel private(i,j,k)&lt;br /&gt;&lt;/pre&gt;to:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;#ifdef _OPENMP&lt;br /&gt;  #pragma omp parallel private(i,j,k)&lt;br /&gt;#endif&lt;br /&gt;&lt;/pre&gt;&lt;ul style="font-weight: bold;"&gt;&lt;li&gt;compile and run&lt;/li&gt;&lt;/ul&gt;the &lt;span style="font-family:courier new;"&gt;ulimit&lt;/span&gt; command controls the user resources available to a process started by the shell. you need to set the stack size to an appropriate size; otherwise, the application will generate a segmentation fault. 
following command sets the maximum stack size to unlimited.&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ ulimit -s unlimited&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;compile with &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature and static linking mode:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ icc -std=c99 -openmp -static openmp_sample.c&lt;br /&gt;openmp_sample.c(119): (col. 5) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.&lt;br /&gt;openmp_sample.c(126): (col. 5) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.&lt;br /&gt;openmp_sample.c(106): (col. 3) remark: OpenMP DEFINED REGION WAS PARALLELIZED.&lt;br /&gt;&lt;br /&gt;$ ls -l&lt;br /&gt;total 1092&lt;br /&gt;-rwxr-xr-x 1 lain lain 1105135 2008-04-02 11:48 a.out&lt;br /&gt;-rw-r--r-- 1 lain lain    4702 2008-04-02 11:06 openmp_sample.c&lt;br /&gt;&lt;br /&gt;$ ./a.out&lt;br /&gt;&lt;br /&gt;Using time() for wall clock time&lt;br /&gt;Problem size: c(600,2400) = a(600,1200) * b(1200,2400)&lt;br /&gt;Calculating product 5 time(s)&lt;br /&gt;&lt;br /&gt;We are using 2 thread(s)&lt;br /&gt;&lt;br /&gt;Finished calculations.&lt;br /&gt;Matmul kernel wall clock time = 6.00 sec&lt;br /&gt;Wall clock time/thread = 3.00 sec&lt;br /&gt;MFlops = 2880.000000&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;compile with &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature and dynamic linking mode:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ icc -std=c99 -openmp openmp_sample.c&lt;br /&gt;openmp_sample.c(124): (col. 5) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.&lt;br /&gt;openmp_sample.c(133): (col. 5) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.&lt;br /&gt;openmp_sample.c(107): (col. 
3) remark: OpenMP DEFINED REGION WAS PARALLELIZED.&lt;br /&gt;&lt;br /&gt;$ ls -l&lt;br /&gt;total 44&lt;br /&gt;-rwxr-xr-x 1 lain lain 34087 2008-04-02 12:01 a.out&lt;br /&gt;-rw-r--r-- 1 lain lain  4702 2008-04-02 11:06 openmp_sample.c&lt;br /&gt;&lt;br /&gt;$ ./a.out&lt;br /&gt;Using time() for wall clock time&lt;br /&gt;Problem size: c(600,2400) = a(600,1200) * b(1200,2400)&lt;br /&gt;Calculating product 5 time(s)&lt;br /&gt;&lt;br /&gt;We are using 2 thread(s)&lt;br /&gt;&lt;br /&gt;Finished calculations.&lt;br /&gt;Matmul kernel wall clock time = 6.00 sec&lt;br /&gt;Wall clock time/thread = 3.00 sec&lt;br /&gt;MFlops = 2880.000000&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;compile without &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature and static linking mode:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ icc -std=c99 -static openmp_sample.c&lt;br /&gt;&lt;br /&gt;$ ls -l&lt;br /&gt;total 512&lt;br /&gt;-rwxr-xr-x 1 lain lain 509458 2008-04-02 11:49 a.out&lt;br /&gt;-rw-r--r-- 1 lain lain   4702 2008-04-02 11:06 openmp_sample.c&lt;br /&gt;&lt;br /&gt;$ ./a.out&lt;br /&gt;Using time() for wall clock time&lt;br /&gt;Problem size: c(600,2400) = a(600,1200) * b(1200,2400)&lt;br /&gt;Calculating product 5 time(s)&lt;br /&gt;&lt;br /&gt;We are using 1 thread(s)&lt;br /&gt;&lt;br /&gt;Finished calculations.&lt;br /&gt;Matmul kernel wall clock time = 17.00 sec&lt;br /&gt;Wall clock time/thread = 17.00 sec&lt;br /&gt;MFlops = 1016.470588&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;compile without &lt;span style="font-style: italic;"&gt;openmp&lt;/span&gt; feature and dynamic linking mode:&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;$ icc -std=c99 openmp_sample.c&lt;br /&gt;&lt;br /&gt;$ ls -l&lt;br /&gt;total 32&lt;br /&gt;-rwxr-xr-x 1 lain lain 23570 2008-04-02 12:03 a.out&lt;br /&gt;-rw-r--r-- 1 lain lain  4702 2008-04-02 11:06 
openmp_sample.c&lt;br /&gt;&lt;br /&gt;$ ./a.out&lt;br /&gt;Using time() for wall clock time&lt;br /&gt;Problem size: c(600,2400) = a(600,1200) * b(1200,2400)&lt;br /&gt;Calculating product 5 time(s)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;We are using 1 thread(s)&lt;br /&gt;&lt;br /&gt;Finished calculations.&lt;br /&gt;Matmul kernel wall clock time = 17.00 sec&lt;br /&gt;Wall clock time/thread = 17.00 sec&lt;br /&gt;MFlops = 1016.470588&lt;br /&gt;&lt;/pre&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;conclusion&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;| openmp  | number of | linking | file size |  time to  |    mega      |&lt;br /&gt;| feature |  threads  |  mode   |  (bytes)  |  finish   |    flops     |&lt;br /&gt;|         |           |         |           | (seconds) |              |&lt;br /&gt;+---------+-----------+---------+-----------+-----------+--------------+&lt;br /&gt;|   yes   |     2     | static  | 1,105,135 |     6     | 2,880.000000 |&lt;br /&gt;|   yes   |     2     | dynamic |    34,087 |     6     | 2,880.000000 |&lt;br /&gt;|   no    |     1     | static  |   509,458 |    17     | 1,016.470588 |&lt;br /&gt;|   no    |     1     | dynamic |    23,570 |    17     | 1,016.470588 |&lt;br /&gt;&lt;/pre&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;system monitor&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;the following &lt;span style="font-style: italic;"&gt;kde system guard (performance monitor)&lt;/span&gt; capture shows the &lt;span style="font-style: italic;"&gt;cpu load&lt;/span&gt; of &lt;span style="font-style: italic;"&gt;cpu0&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;cpu1&lt;/span&gt;. 
the two purple numbers in the screenshot mark:&lt;br /&gt;(1) single processor working: only &lt;span style="font-style: italic;"&gt;cpu0&lt;/span&gt; is 100% active, so the run finishes more slowly&lt;br /&gt;(2) dual processors working: both &lt;span style="font-style: italic;"&gt;cpu0&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;cpu1&lt;/span&gt; are 100% active, so the run finishes faster&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_e7_RlKSyGls/R_3b3DZxzUI/AAAAAAAAAJA/U5IEIUhKEwE/s1600-h/openmp+-+system+load+%28kde%29+-+edited+small.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://1.bp.blogspot.com/_e7_RlKSyGls/R_3b3DZxzUI/AAAAAAAAAJA/U5IEIUhKEwE/s320/openmp+-+system+load+%28kde%29+-+edited+small.png" alt="" id="BLOGGER_PHOTO_ID_5187544084653395266" border="0" /&gt;&lt;/a&gt;&lt;ul style="font-weight: bold;"&gt;&lt;li&gt;sample code&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:courier new;"&gt;openmp_sample.c&lt;/span&gt; file:&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 0.5em; overflow: auto;"&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;/*&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Copyright (C) 2006-2007 Intel Corporation. All Rights Reserved.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * The source code contained or described herein and all&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * documents related to the source code ("Material") are owned by&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Intel Corporation or its suppliers or licensors. 
Title to the&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Material remains with Intel Corporation or its suppliers and&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * licensors. The Material is protected by worldwide copyright&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * laws and treaty provisions.  No part of the Material may be&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * used, copied, reproduced, modified, published, uploaded,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * posted, transmitted, distributed,  or disclosed in any way&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * except as expressly provided in the license provided with the&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Materials.  No license under any patent, copyright, trade&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * secret or other intellectual property right is granted to or&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * conferred upon you by disclosure or delivery of the Materials,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * either expressly, by implication, inducement, estoppel or&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * otherwise, except as expressly provided in the license&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * provided with the Materials.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * [DESCRIPTION]&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Each element of the product matrix c[i][j] is &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * computed from a unique row and&lt;/span&gt;&lt;br 
/&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * column of the factor matrices, a[i][k] and b[k][j].&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * In the multithreaded implementation, each thread can&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * concurrently compute some submatrix of the product without&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * needing OpenMP data or control synchronization.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * The algorithm uses OpenMP* to parallelize the outer-most loop,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * using the "i" row index.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Both the outer-most "i" loop and middle "k" loop are manually&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * unrolled by 4.  
The inner-most "j" loop iterates one-by-one&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * over the columns of the product and factor matrices.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * [COMPILE]&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Use the following compiler options to compile both multi- and &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * single-threaded versions.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Parallel compilation:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  You must set the stacksize to an appropriate size; otherwise,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  the application will generate a segmentation fault. &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  Linux* and Mac OS* X: appropriate ulimit commands are shown for &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  bash shell.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  Windows*: /Qstd=c99 /Qopenmp /F256000000&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  Linux*:   ulimit -s unlimited&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *            -std=c99 -openmp&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * &lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  Mac OS* X:  ulimit -s 64000&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *            -std=c99 -openmp&lt;/span&gt;&lt;br 
/&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; * Serial compilation:&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  Use the same command, but omit the -openmp (Linux and Mac OS X)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *  or /Qopenmp (Windows) option.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; *&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt; */&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#include &amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#include &amp;lt;time.h&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#include &amp;lt;float.h&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#include &amp;lt;math.h&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#ifdef _OPENMP&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#include &amp;lt;omp.h&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#endif&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define bool _Bool&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define true 1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define false 0&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// Matrix size constants&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// Be careful to set your shell's stacksize limit to a high value if you&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// wish to increase the SIZE.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define SIZE     4800     // Must be 
a multiple of 8.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define M        SIZE/8&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define N        SIZE/4&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define P        SIZE/2&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#define NTIMES   5        // product matrix calculations&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;int&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; main(&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;void&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;double&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; a[M][N], b[N][P], c[M][P], walltime;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  bool nthr_checked=false;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  time_t start;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;int&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; i, j, k, l, i1, i2, i3, k1, k2, k3, nthr=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"Using time() for wall clock time&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: 
rgb(255, 0, 255);"&gt;"Problem size: c(%d,%d) = a(%d,%d) * b(%d,%d)&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         M, P, M, N, N, P);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"Calculating product %d time(s)&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, NTIMES);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// a is a matrix of all ones&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (i=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; i&amp;lt;M; i++)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (j=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; j&amp;lt;N; j++)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      a[i][j] = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1.0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// each column of b is the sequence 1,2,...,N&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span 
style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (i=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; i&amp;lt;N; i++)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (j=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; j&amp;lt;P; j++)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      b[i][j] = i+&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1.&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  start = time(NULL);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#ifdef _OPENMP&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#pragma omp parallel private(i,j,k,l)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#endif&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (l=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; l&amp;lt;NTIMES; l++) {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#ifdef _OPENMP&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#pragma omp single nowait&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#endif&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    
&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;if&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (!nthr_checked) {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#ifdef _OPENMP&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      nthr = omp_get_num_threads();&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#endif&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      printf( &lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;We are using %d thread(s)&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, nthr);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      nthr_checked = true;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// Initialize product matrix&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#ifdef _OPENMP&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#pragma omp for nowait&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#endif&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (i=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; i&amp;lt;M; i++)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      &lt;/span&gt;&lt;span style="color: rgb(255, 255, 
0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (j=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; j&amp;lt;P; j++)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;        c[i][j] = &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0.0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// Parallelize by row.  The threads don't need to synchronize at&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// loop end, so "nowait" can be used.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#ifdef _OPENMP&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#pragma omp for nowait&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;#endif&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (i=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; i&amp;lt;M; i++) {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (k=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; k&amp;lt;N; k++) {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// Each element of the product is just the sum 1+2+...+n&lt;/span&gt;&lt;br /&gt;&lt;span style="color: 
rgb(0, 255, 51);"&gt;        &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (j=&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;; j&amp;lt;P; j++) {&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;          c[i][j]  += a[i][k]  * b[k][j];&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;        }&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      }&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    }&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  } &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// l=0,...NTIMES-1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  } &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// #pragma omp parallel&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  walltime = time(NULL) - start;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;Finished calculations.&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"Matmul kernel wall clock time = %.2f sec&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, walltime);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: 
rgb(255, 0, 255);"&gt;"Wall clock time/thread = %.2f sec&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;, walltime/nthr);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  printf(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"MFlops = %f&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;\n&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      (&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;double&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;)(NTIMES)*(&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;double&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;)(N*M*&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;2&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;)*(&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;double&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;)(P)/walltime/&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1.0e6&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;  &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;return&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6029867844615683735-2150644835188122843?l=l411v.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://l411v.blogspot.com/feeds/2150644835188122843/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6029867844615683735&amp;postID=2150644835188122843' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/2150644835188122843'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/2150644835188122843'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/2008/04/openmp-on-intel-and-linux.html' title='openmp c/c++ on intel and linux using intel compiler'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_e7_RlKSyGls/R_3b3DZxzUI/AAAAAAAAAJA/U5IEIUhKEwE/s72-c/openmp+-+system+load+%28kde%29+-+edited+small.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6029867844615683735.post-7620007183726965347</id><published>2008-03-29T14:41:00.023+08:00</published><updated>2008-12-10T18:42:16.937+08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='code'/><title type='text'>syntax highlighting on blog</title><content type='html'>since &lt;a href="http://code.google.com/p/syntaxhighlighter/"&gt;syntax highlighter&lt;/a&gt; doesn't support &lt;span style="font-style: italic;"&gt;vhdl&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;verilog&lt;/span&gt; code yet, so i manually convert the &lt;span style="font-style: italic;"&gt;highlighted text file&lt;/span&gt; into &lt;span style="font-style: italic;"&gt;html 
file&lt;/span&gt;. follow these steps:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;open your code in &lt;a href="http://en.wikipedia.org/wiki/KWrite"&gt;kwrite&lt;/a&gt; and set the highlighting colors to your taste by selecting "Settings &gt; Configure Editor..." and clicking on "Fonts &amp;amp; Colors"&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_e7_RlKSyGls/R-4oGG8N4FI/AAAAAAAAAHw/KsF2lZ6gM5M/s1600-h/img1_edit.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://2.bp.blogspot.com/_e7_RlKSyGls/R-4oGG8N4FI/AAAAAAAAAHw/KsF2lZ6gM5M/s320/img1_edit.png" alt="" id="BLOGGER_PHOTO_ID_5183124306557526098" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;export your code to an &lt;span style="font-style: italic;"&gt;html file&lt;/span&gt; by clicking on "File &gt; Export as HTML..." and then "Save"&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_e7_RlKSyGls/R-3qFW8N4EI/AAAAAAAAAHo/Jt8JysJUdcg/s1600-h/img2_edit.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://2.bp.blogspot.com/_e7_RlKSyGls/R-3qFW8N4EI/AAAAAAAAAHo/Jt8JysJUdcg/s320/img2_edit.png" alt="" id="BLOGGER_PHOTO_ID_5183056123951702082" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;copy the &lt;span style="font-style: italic;"&gt;html source&lt;/span&gt; of your &lt;span style="font-style: italic;"&gt;html file&lt;/span&gt;: open the &lt;span style="font-style: italic;"&gt;html file&lt;/span&gt; in your web browser and click on "View &gt; Page Source" (if you are using &lt;a href="http://en.wikipedia.org/wiki/Mozilla_Firefox"&gt;mozilla firefox&lt;/a&gt;). 
copy everything from the &amp;lt;pre&amp;gt; tag to the &amp;lt;/pre&amp;gt; tag.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_e7_RlKSyGls/R-4pWW8N4GI/AAAAAAAAAH4/C-3fFd_D31w/s1600-h/img3_edit.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://3.bp.blogspot.com/_e7_RlKSyGls/R-4pWW8N4GI/AAAAAAAAAH4/C-3fFd_D31w/s320/img3_edit.png" alt="" id="BLOGGER_PHOTO_ID_5183125685242028130" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_e7_RlKSyGls/R-4qo28N4HI/AAAAAAAAAIA/UjQNrIjVUo4/s1600-h/img4_edit.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://1.bp.blogspot.com/_e7_RlKSyGls/R-4qo28N4HI/AAAAAAAAAIA/UjQNrIjVUo4/s320/img4_edit.png" alt="" id="BLOGGER_PHOTO_ID_5183127102581235826" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;paste it into your new blog post inside the "Edit Html" tab and click back on the "Compose" tab to see the result.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_e7_RlKSyGls/R-4tFW8N4II/AAAAAAAAAII/lkbhTfZ32vg/s1600-h/img5_edit.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://3.bp.blogspot.com/_e7_RlKSyGls/R-4tFW8N4II/AAAAAAAAAII/lkbhTfZ32vg/s320/img5_edit.png" alt="" id="BLOGGER_PHOTO_ID_5183129791230763138" border="0" /&gt;&lt;/a&gt;&lt;ul&gt;&lt;li&gt;add a border to make it look nicer by giving the &lt;span style="font-style: italic;"&gt;html&lt;/span&gt; &amp;lt;pre&amp;gt; tag the following style.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 1em; overflow: auto; background-color: rgb(0, 0, 0);"&gt;&lt;span style="color: rgb(255, 255, 
0);"&gt;&amp;lt;pre&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt; style=&lt;/span&gt;&lt;span style="color: rgb(170, 0, 0);"&gt;"border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 1em; overflow: auto; background-color: rgb(0, 0, 0);"&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;ul&gt;&lt;li&gt;done! here are examples of highlighted &lt;span style="font-style: italic;"&gt;vhdl&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;verilog&lt;/span&gt; code:&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 1em; overflow: auto; background-color: rgb(0, 0, 0);"&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;library&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; IEEE&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;use&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; IEEE&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;STD_LOGIC_1164&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ALL&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;use&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; IEEE&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;STD_LOGIC_ARITH&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ALL&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;use&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; IEEE&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(0, 255, 
51);"&gt;STD_LOGIC_UNSIGNED&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;.&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;ALL&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;entity&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Const_Unit &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;is&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   Port &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;Imm &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;in&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;std_logic_vector&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;15&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;downto&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         CS &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;in&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;std_logic&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         Const &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span 
style="color: rgb(255, 255, 0);"&gt;out&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;std_logic_vector&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;31&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;downto&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;));&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Const_Unit&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;architecture&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Behavioral &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;of&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Const_Unit &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;is&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;begin&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;-- pass the 16 lsb value.&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   Const&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;15&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;downto&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;)&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(0, 255, 
64);"&gt;&amp;lt;=&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Imm&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;15&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;downto&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;process&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;CS&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;,&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Imm&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;)&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;begin&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;if&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;CS &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;=&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;'0'&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;)&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;then&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;-- unsign value&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 
51);"&gt;         Const&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;31&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;downto&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;16&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;)&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;&amp;lt;=&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; X&lt;/span&gt;&lt;span style="color: rgb(255, 0, 255);"&gt;"0000"&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;else&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;-- sign value (sign extension)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;for&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; i &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;in&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;31&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;downto&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;16&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;loop&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;            Const&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(0, 255, 
51);"&gt;i&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;)&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;&amp;lt;=&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Imm&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;(&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;15&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;loop&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;if&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;process&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; Behavioral&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-style: italic;"&gt;code: highlighted vhdl code&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;pre style="border: 1px dashed rgb(64, 64, 64); margin: 0em; padding: 1em; overflow: auto; background-color: rgb(0, 0, 0);"&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;`timescale 1ns / 1ns&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span 
style="color: rgb(0, 255, 64);"&gt;`include &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;"../inc/ctr.h"&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;module&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; ctr(&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// input&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   clk, rst,&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// output&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   out&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;);&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// i/os&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;input&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;                      clk,        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// clock&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;                              rst;        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// reset&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;output&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; [&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;`WIDTH&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;]        out;        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 
221);"&gt;// output&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// internal signals&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;wire&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;                       clk,        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// clock&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;                              rst;        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// reset&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;reg&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;    [&lt;/span&gt;&lt;span style="color: rgb(0, 255, 64);"&gt;`WIDTH&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;-&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;:&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;]        out;        &lt;/span&gt;&lt;span style="color: rgb(221, 221, 221);"&gt;// output&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;always&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; @(&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;posedge&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; clk &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;or&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;posedge&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; rst) &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;begin&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      
&lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;if&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt; (rst)&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         out &amp;lt;= &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;0&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;      &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;else&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;         out &amp;lt;= out + &lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;1&lt;/span&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 255, 51);"&gt;   &lt;/span&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;end&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(255, 255, 0);"&gt;endmodule&lt;/span&gt;&lt;/pre&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-style: italic;"&gt;code: highlighted verilog code&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6029867844615683735-7620007183726965347?l=l411v.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://l411v.blogspot.com/feeds/7620007183726965347/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6029867844615683735&amp;postID=7620007183726965347' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/7620007183726965347'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/7620007183726965347'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/2008/03/syntax-highlighting-on-blog.html' title='syntax 
highlighting on blog'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_e7_RlKSyGls/R-4oGG8N4FI/AAAAAAAAAHw/KsF2lZ6gM5M/s72-c/img1_edit.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6029867844615683735.post-1669280864964517815</id><published>2007-11-24T22:07:00.002+08:00</published><updated>2010-05-21T00:36:31.018+08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='parallel processing'/><title type='text'>opensparc for hobby (setting ide)</title><content type='html'>i first learned about the &lt;a href="http://www.opensparc.net/"&gt;&lt;span style="font-style: italic;"&gt;opensparc&lt;/span&gt;&lt;/a&gt; source code, and downloaded it right away, eight months ago. i grabbed it just as a hobby, so i have no plans to buy any expensive &lt;span style="font-style: italic;"&gt;ide&lt;/span&gt;, simulator, synthesizer, or loader. i don't even want to put it on a die. 
i want to explore it using open source or free software and load it (or part of it) onto an &lt;span style="font-style: italic;"&gt;fpga&lt;/span&gt; board.&lt;br /&gt;&lt;br /&gt;what i have done so far:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;joining related projects&lt;/li&gt;&lt;/ul&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;/blockquote&gt;i joined the &lt;span style="font-style: italic;"&gt;opensparc-t1&lt;/span&gt; &lt;a href="http://s1.sunsource.net/"&gt;&lt;span style="font-style: italic;"&gt;s1&lt;/span&gt;&lt;/a&gt; project and the &lt;a href="http://fpga.sunsource.net/"&gt;&lt;span style="font-style: italic;"&gt;fpga&lt;/span&gt;&lt;/a&gt; project.&lt;br /&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;ul&gt;&lt;li&gt;finding open source software&lt;/li&gt;&lt;/ul&gt;i'm currently using &lt;a href="http://www.debian.org/"&gt;&lt;span style="font-style: italic;"&gt;debian&lt;/span&gt; &lt;span style="font-style: italic;"&gt;gnu&lt;/span&gt;/&lt;span style="font-style: italic;"&gt;linux&lt;/span&gt; 4.0&lt;/a&gt; for the operating system, &lt;a href="http://www.eclipse.org/"&gt;&lt;span style="font-style: italic;"&gt;eclipse&lt;/span&gt; 3.3.1.1&lt;/a&gt; and &lt;a href="http://veditor.sourceforge.net/"&gt;&lt;span style="font-style: italic;"&gt;verilog editor&lt;/span&gt; 0.5.2&lt;/a&gt; for the &lt;span style="font-style: italic;"&gt;ide&lt;/span&gt;, &lt;a href="http://www.icarus.com/eda/verilog/"&gt;&lt;span style="font-style: italic;"&gt;icarus verilog 0.8&lt;/span&gt;&lt;/a&gt; for the simulator, and &lt;a href="http://home.nc.rr.com/gtkwave/"&gt;&lt;span style="font-style: italic;"&gt;gtkwave&lt;/span&gt; 1.3.81&lt;/a&gt; as the waveform viewer for the simulation results. 
for synthesis, i tried the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; with &lt;span style="font-style: italic;"&gt;icarus verilog&lt;/span&gt;, but it failed.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_e7_RlKSyGls/R0hyCIsBgQI/AAAAAAAAAGg/hDIgof6ldKo/s1600-h/s1_core_eclipse.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://4.bp.blogspot.com/_e7_RlKSyGls/R0hyCIsBgQI/AAAAAAAAAGg/hDIgof6ldKo/s320/s1_core_eclipse.png" alt="" id="BLOGGER_PHOTO_ID_5136480756033487106" border="0" /&gt;&lt;/a&gt;&lt;span style="font-style: italic;"&gt;eclipse ide with verilog editor on debian gnu/linux 4.0&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;i have added &lt;span style="font-style: italic;"&gt;opensparct1.1.5&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; as &lt;span style="font-style: italic;"&gt;eclipse&lt;/span&gt; projects. 
i also set up &lt;span style="font-style: italic;"&gt;eclipse&lt;/span&gt; to run some external tools for the &lt;span style="font-style: italic;"&gt;s1&lt;/span&gt; project: build the &lt;span style="font-style: italic;"&gt;icarus&lt;/span&gt; simulation, run the &lt;span style="font-style: italic;"&gt;icarus&lt;/span&gt; simulation, and display the test waveform.&lt;br /&gt;&lt;br /&gt;in the following example, the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; is extracted under the &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core/&lt;/span&gt; directory.&lt;br /&gt;&lt;br /&gt;the following steps show how to set up the external tool that builds the &lt;span style="font-style: italic;"&gt;icarus&lt;/span&gt; simulation:&lt;br /&gt;+ select &lt;span style="font-style: italic;"&gt;run&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;external tools&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;open external tools dialog...&lt;/span&gt;&lt;br /&gt;+ &lt;span&gt;in the &lt;/span&gt;left box, double-click &lt;span style="font-style: italic;"&gt;program&lt;/span&gt;&lt;br /&gt;+ in the right box, fill in:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- name: s1 - build &lt;span&gt;icarus&lt;/span&gt; simulation&lt;br /&gt;+ on the &lt;span style="font-style: italic;"&gt;main&lt;/span&gt; tab, fill in:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- location: &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core/tools/bin/build_icarus&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;build_icarus&lt;/span&gt; script)&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- working directory: ${workspace_loc:/s1_core}&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(&lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; 
working directory)&lt;br /&gt;+ on the &lt;span style="font-style: italic;"&gt;environment&lt;/span&gt; tab:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- click the &lt;span style="font-style: italic;"&gt;new...&lt;/span&gt; button&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- fill in variable name with &lt;span style="font-style: italic;"&gt;FILELIST_ICARUS&lt;/span&gt;, value with &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core/hdl/filelist.icarus&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;filelist.icarus&lt;/span&gt; file)&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- click the &lt;span style="font-style: italic;"&gt;new...&lt;/span&gt; button again&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- fill in variable name with &lt;span style="font-style: italic;"&gt;S1_ROOT&lt;/span&gt;, value with &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; root directory)&lt;br /&gt;+ please make sure that the directories and environment variables are correct. one mistake can destroy your data. 
this is because the &lt;span style="font-style: italic;"&gt;s1&lt;/span&gt; script executes an "rm -rf *" command.&lt;br /&gt;+ click the &lt;span style="font-style: italic;"&gt;apply&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;close&lt;/span&gt; buttons&lt;br /&gt;+ the new entry will appear under &lt;span style="font-style: italic;"&gt;run&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;external tools&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;s1 - build icarus simulation&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;the following steps show how to set up the external tool that runs the &lt;span style="font-style: italic;"&gt;icarus&lt;/span&gt; simulation:&lt;br /&gt;+ select &lt;span style="font-style: italic;"&gt;run&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;external tools&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;open external tools dialog...&lt;/span&gt;&lt;br /&gt;+ &lt;span&gt;in the &lt;/span&gt;left box, double-click &lt;span style="font-style: italic;"&gt;program&lt;/span&gt;&lt;br /&gt;+ in the right box, fill in:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- name: s1 - run &lt;span style="font-style: italic;"&gt;icarus&lt;/span&gt; simulation&lt;br /&gt;+ on the &lt;span style="font-style: italic;"&gt;main&lt;/span&gt; tab, fill in:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- location: &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core/tools/bin/run_icarus&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;run_icarus&lt;/span&gt; script)&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- working directory: ${workspace_loc:/s1_core}&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(&lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; working directory)&lt;br /&gt;+ on the &lt;span style="font-style: italic;"&gt;environment&lt;/span&gt; 
tab:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- click the &lt;span style="font-style: italic;"&gt;new...&lt;/span&gt; button&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- fill in variable name with &lt;span style="font-style: italic;"&gt;S1_ROOT&lt;/span&gt;, value with &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; root directory)&lt;br /&gt;+ click the &lt;span style="font-style: italic;"&gt;apply&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;close&lt;/span&gt; buttons&lt;br /&gt;+ the new entry will appear under &lt;span style="font-style: italic;"&gt;run&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;external tools&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;s1 - run icarus simulation&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;the following steps show how to set up the external tool that displays the test waveform:&lt;br /&gt;+ select &lt;span style="font-style: italic;"&gt;run&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;external tools&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;open external tools dialog...&lt;/span&gt;&lt;br /&gt;+ &lt;span&gt;in the &lt;/span&gt;left box, double-click &lt;span style="font-style: italic;"&gt;program&lt;/span&gt;&lt;br /&gt;+ in the right box, fill in:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- name: s1 - display test waveform&lt;br /&gt;+ on the &lt;span style="font-style: italic;"&gt;main&lt;/span&gt; tab, fill in:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- location: &lt;span style="font-style: italic;"&gt;/usr/bin/gtkwave&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;gtkwave&lt;/span&gt; application)&lt;br /&gt;&lt;span style="color: 
rgb(0, 0, 0);"&gt;___&lt;/span&gt;- working directory: ${workspace_loc:/s1_core}&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(&lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; working directory)&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- arguments: &lt;span style="font-style: italic;"&gt;run/sim/icarus/trace.vcd&lt;/span&gt; &lt;span style="font-style: italic;"&gt;tools/src/gtkwave.sav&lt;/span&gt;&lt;br /&gt;+ on the &lt;span style="font-style: italic;"&gt;environment&lt;/span&gt; tab:&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- click the &lt;span style="font-style: italic;"&gt;new...&lt;/span&gt; button&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;___&lt;/span&gt;- fill in variable name with &lt;span style="font-style: italic;"&gt;S1_ROOT&lt;/span&gt;, value with &lt;span style="font-style: italic;"&gt;.../openSparc/s1_core&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;_____&lt;/span&gt;(location of the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; root directory)&lt;br /&gt;+ click the &lt;span style="font-style: italic;"&gt;apply&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;close&lt;/span&gt; buttons&lt;br /&gt;+ the new entry will appear under &lt;span style="font-style: italic;"&gt;run&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;external tools&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;s1 - display test waveform&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;the default font size in &lt;span style="font-style: italic;"&gt;eclipse&lt;/span&gt; is too big for me, so i changed it via &lt;span style="font-style: italic;"&gt;window&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;preferences...&lt;/span&gt;, then &lt;span style="font-style: italic;"&gt;general&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;appearance&lt;/span&gt; - &lt;span style="font-style: italic;"&gt;colors and fonts&lt;/span&gt;. 
some other &lt;span style="font-style: italic;"&gt;gtk&lt;/span&gt; default fonts can be changed as well. read this blog post on how to &lt;a href="http://blog.xam.dk/archives/81-Making-Eclipse-look-good-on-Linux.html"&gt;make &lt;span style="font-style: italic;"&gt;eclipse&lt;/span&gt; look good on &lt;span style="font-style: italic;"&gt;linux&lt;/span&gt;&lt;/a&gt;.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;finding free software&lt;/li&gt;&lt;/ul&gt;i have installed &lt;a href="http://www.xilinx.com/"&gt;&lt;span style="font-style: italic;"&gt;xilinx ise webpack&lt;/span&gt; 9.2i &lt;/a&gt;on my &lt;span style="font-style: italic;"&gt;debian&lt;/span&gt; &lt;span style="font-style: italic;"&gt;gnu&lt;/span&gt;/&lt;span style="font-style: italic;"&gt;linux&lt;/span&gt; 4.0 box. i only needed to install one additional library and it just worked. by default, &lt;span style="font-style: italic;"&gt;debian&lt;/span&gt; comes with &lt;span style="font-style: italic;"&gt;libstdc++6&lt;/span&gt;, but &lt;span style="font-style: italic;"&gt;xilinx ise webpack 9.2i&lt;/span&gt; requires &lt;span style="font-style: italic;"&gt;libstdc++5&lt;/span&gt;. just install the old library without removing the new one.&lt;br /&gt;&lt;br /&gt;synthesis works fine, but you can't find any &lt;span style="font-style: italic;"&gt;fpga&lt;/span&gt; board supported by the &lt;span style="font-style: italic;"&gt;xilinx ise webpack&lt;/span&gt; edition that can fit &lt;span style="font-style: italic;"&gt;opensparc t1&lt;/span&gt; or even the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt;. 
the default simulator (just a simple simulator) from &lt;span style="font-style: italic;"&gt;xilinx ise webpack&lt;/span&gt; also can't simulate the &lt;span style="font-style: italic;"&gt;s1 core&lt;/span&gt; test.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_e7_RlKSyGls/R0hxXIsBgPI/AAAAAAAAAGY/Fc-gl0oxrpk/s1600-h/s1_core_ise.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://4.bp.blogspot.com/_e7_RlKSyGls/R0hxXIsBgPI/AAAAAAAAAGY/Fc-gl0oxrpk/s320/s1_core_ise.png" alt="" id="BLOGGER_PHOTO_ID_5136480017299112178" border="0" /&gt;&lt;/a&gt;&lt;span style="font-style: italic;"&gt;xilinx ise webpack on debian gnu/linux 4.0&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6029867844615683735-1669280864964517815?l=l411v.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://l411v.blogspot.com/feeds/1669280864964517815/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6029867844615683735&amp;postID=1669280864964517815' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/1669280864964517815'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6029867844615683735/posts/default/1669280864964517815'/><link rel='alternate' type='text/html' href='http://l411v.blogspot.com/2007/11/opensparc-for-hobby.html' title='opensparc for hobby (setting ide)'/><author><name>lain.ux</name><uri>http://www.blogger.com/profile/00304582461804186770</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://bp2.blogger.com/_e7_RlKSyGls/Rnz-X6ch3bI/AAAAAAAAAEg/fOG29oX5Kjs/s320/lain_ux.png'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_e7_RlKSyGls/R0hyCIsBgQI/AAAAAAAAAGg/hDIgof6ldKo/s72-c/s1_core_eclipse.png' height='72' width='72'/><thr:total>0</thr:total></entry></feed>
