To really get familiar with Eclipse ODA, I decided to go through the three-part "ODA Extensions and BIRT" series by Scott Rosenbaum and Jason Weathersby. It is not publicly linked but can be accessed here after registering. The tutorial is spread across volumes 8, 9, and 14 of the Eclipse magazine.
The end result of the ODA is the ability to use Google Spreadsheets as a data source. This is particularly appropriate for me because I am wrapping up a big networking project where we have been storing the simulation results in a Google Spreadsheet. The project involves the effects of node density on packet delivery ratio for three different wireless routing protocols.
Going through this ODA guide exposed me to Eclipse plugin and ODA concepts and terminology, which will be helpful for this project. With respect to ODA, a simple but important point seems to be the separation of design time and runtime. The first part of the article mainly focuses on the runtime, the second part on the design time (GUI), and the third part on adding logging, optimization, data types, and parameters.
I think the most valuable lesson I learned was about troubleshooting an ODA. Setting up the Logger for the ODA was crucial to solving a snag I ran into. I had the piece of the ODA working that identifies all the user's spreadsheets and then lets you drill down and select an individual sheet for the query. However, the data preview kept failing with a CannotExecuteStatement error. Needless to say, this generic error didn't help identify the root cause of the problem. Looking at what the ODA was logging showed it was throwing a com.google.gdata.util.ServiceException. After a few alterations to the ODA I could see the entire stack trace, which showed it was throwing a com.google.gdata.util.InvalidEntryException (extends ServiceException), which typically indicates a bad or malformed request. The final step was to log the actual query being made, and I could see that the problem was that the filterClause data set parameter was defaulting to 'dummy default value'. Changing this to null allowed the data preview to work. See the image above that shows the node density and delivery ratio data from my project spreadsheet.
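The troubleshooting approach described here (log the exact query, and log the full stack trace instead of only the wrapper error) can be sketched roughly as follows. This is a minimal sketch using plain java.util.logging, not the ODA framework's own logging classes. The QueryRunner interface, the buildQuery helper, and the query string format are all hypothetical names made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class QueryLoggingSketch {
    private static final Logger LOG =
            Logger.getLogger(QueryLoggingSketch.class.getName());

    /** Hypothetical stand-in for the call that sends the query to the remote service. */
    interface QueryRunner {
        String run(String query) throws Exception;
    }

    /** Hypothetical query format: the sheet name plus an optional filter clause. */
    static String buildQuery(String sheet, String filterClause) {
        if (filterClause == null || filterClause.isEmpty()) {
            return sheet;
        }
        return sheet + " where " + filterClause;
    }

    /** Logs the exact query before running it, and the full stack trace on failure. */
    static String execute(QueryRunner runner, String sheet, String filterClause)
            throws Exception {
        String query = buildQuery(sheet, filterClause);
        LOG.log(Level.INFO, "Executing query: {0}", query);
        try {
            return runner.run(query);
        } catch (Exception e) {
            // Passing the Throwable makes the logger print the whole stack,
            // including the concrete exception type behind a generic error.
            LOG.log(Level.SEVERE, "Query failed: " + query, e);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // A leftover placeholder default shows up immediately in the logged query.
        execute(q -> "ok", "results", "dummy default value");
        execute(q -> "ok", "results", null);
    }
}
```

With the query in the log, a placeholder such as 'dummy default value' in the filter clause is visible at a glance, which is much harder to spot from a generic execution error alone.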
The next thing I am going to do is get, build, and become familiar with the OpenMRS ODA. I've also started adding links that are useful to this project to the right navigation area of this blog.
I had another discussion last Tuesday with Shaun Grannis and James Egg, and I think the discussion went really well. This time I didn’t ask too many silly questions hehe … We clarified some more of what we want to do with my first project. There are a couple of issues that we focused on, such as how to propagate the u values from the random sampling result to the EM analysis process.
After doing some digging, I found out that the u value is saved in the MatchingConfigRow object in the non-agreement property. At the end of the random sampling calculation, this non-agreement property is assigned the result of the calculation. So now we already have the u values from the random sampling process. But how do we propagate these u values to the EM analysis process? Dig some more then …
Well, apparently the EM analysis also takes a MatchingConfig object as its parameter, and that object contains all of the MatchingConfigRow objects above. So now we need to tell the EM analysis process to use these values when the user chooses random sampling. We need to add a switch that tells the EM analysis which values to use: some default values or the values from the random sampling process.
Another thing that we discussed on the phone was connecting this process to the Record Linker GUI. Arghhh, I’m not good at GUI programming. I just don’t have the artistic sense to create a good GUI. But I have to give it a shot hehe …
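The switch idea can be sketched like this. It is only an illustration under assumptions: Row and Config are simplified, hypothetical stand-ins for MatchingConfigRow and MatchingConfig, and the useRandomSampling flag, the DEFAULT_U value, and the initialU method are made-up names, not the actual Record Linker API.

```java
import java.util.ArrayList;
import java.util.List;

class UValueSwitchSketch {
    /** Assumed fallback u value used when random sampling is not selected. */
    static final double DEFAULT_U = 0.1;

    /** Simplified stand-in for MatchingConfigRow. */
    static class Row {
        final String name;
        double nonAgreement; // u value written by the random sampling step

        Row(String name, double nonAgreement) {
            this.name = name;
            this.nonAgreement = nonAgreement;
        }
    }

    /** Simplified stand-in for MatchingConfig, shared by both processes. */
    static class Config {
        final List<Row> rows = new ArrayList<>();
        boolean useRandomSampling; // the proposed switch
    }

    /** Picks the u values the EM step would work with, one per column. */
    static double[] initialU(Config config) {
        double[] u = new double[config.rows.size()];
        for (int i = 0; i < u.length; i++) {
            u[i] = config.useRandomSampling
                    ? config.rows.get(i).nonAgreement
                    : DEFAULT_U;
        }
        return u;
    }

    public static void main(String[] args) {
        Config config = new Config();
        config.rows.add(new Row("last_name", 0.02));
        config.rows.add(new Row("sex", 0.5));
        config.useRandomSampling = true;
        for (double u : initialU(config)) {
            System.out.println(u);
        }
    }
}
```

Because both processes share the same configuration object, the flag and the sampled values travel together and the EM step needs no extra parameter.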
Some term explanations:
- Record Linker is the name of the program that I will work on. One of the capabilities of the program is combining records from different sources using statistical analysis on those records.
- MatchingConfig is an object that stores the parameters used for analyzing those records. Lots of parameters need to be defined, for example where to get the records, what fields can be found in the records, etc.
- MatchingConfigRow is an object that stores the options for matching each column in the records, for example the algorithm that will be used for the matching process.
- A MatchingConfig object contains a series of MatchingConfigRow objects, since a single record contains many columns.
- The random sampling and EM analysis processes take this MatchingConfig object as their parameter. The MatchingConfig is shared by the two processes to propagate the result from random sampling to EM analysis.
Some facts that I learned:
- When the records come from a file, a few steps need to be done before the file can be analyzed. The file is chopped to include only the fields that will be used in the analysis process. After the file is chopped, it is sorted on the blocking fields using the operating system's built-in sort function.
- Let’s keep some facts for the upcoming posts hehe …
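The chop-then-sort preparation can be illustrated with a small sketch. Note the swap: Record Linker hands the sort to the operating system's sort function, while this sketch sorts in memory in Java so that it stays self-contained. The method names, the delimiter, and the column indexes are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ChopAndSortSketch {
    /** Keeps only the columns listed in keep from each delimited line. */
    static List<String[]> chop(List<String> lines, String delimiter, int[] keep) {
        List<String[]> out = new ArrayList<>();
        for (String line : lines) {
            String[] all = line.split(Pattern.quote(delimiter), -1);
            String[] kept = new String[keep.length];
            for (int i = 0; i < keep.length; i++) {
                kept[i] = all[keep[i]];
            }
            out.add(kept);
        }
        return out;
    }

    /** Sorts chopped rows on the blocking columns, compared in the given order. */
    static void sortOnBlockingFields(List<String[]> rows, int[] blocking) {
        rows.sort((a, b) -> {
            for (int idx : blocking) {
                int c = a[idx].compareTo(b[idx]);
                if (c != 0) {
                    return c;
                }
            }
            return 0;
        });
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "3|smith|john|x", "1|adams|mary|y", "2|smith|anna|z");
        // Keep last name and first name, then block on both.
        List<String[]> rows = chop(lines, "|", new int[] {1, 2});
        sortOnBlockingFields(rows, new int[] {0, 1});
        for (String[] row : rows) {
            System.out.println(String.join("|", row));
        }
    }
}
```

Sorting on the blocking fields puts records that share blocking values next to each other, so candidate pairs can be read sequentially.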
Any questions? I hope I didn’t miss anything …

I've noticed something while reading my daily blogs: a lot of code is just unreadable because most blog systems (Blogger, looking at you) screw up indentation unless you wrap the code in a pre tag (opening and closing are both required). This makes it readable for your readers! I've left comments on the blogs that didn't know this, and now they do.
This message is primarily for the Google Summer of Code students, but is useful to the programming community as a whole. When you post code, wrap it in a pre tag and be sure to close the tag when your code example is complete.
I finally have a little breathing room from my huge school project and wanted to set up OpenMRS on my laptop. I basically followed the steps from http://openmrs.org/wiki/Step-by-Step_Installation_for_Developers and got things up and running. Here are the versions of the various software I set up:
- Fresh install of Eclipse 3.3.2 with Subclipse 1.2.4. Running Java 1.6.0_05-b13 but with the compliance level set to 1.5.
- Installed MySQL 5.0.51b and configured as a multifunctional db with 20 concurrent connections. Decided to make UTF8 the default character set.
- Installed the MySQL GUI tools so I could easily look at the db model if required.
- ant 1.7.0
- Tomcat 6.0.16
To make sure things were all working correctly, I made myself a patient in the system:

Alright, back to running network simulations for school. Only 10 more days!
Thanks to the generous folks at Review Board, OpenMRS has the opportunity to try out Review Board for performing code reviews for our GSoC projects. Students can download post-review (a Python script) and submit changes for review simply by issuing the command “post-review” within the root of their local copy. Mentors can review the [...]
Well, after some work and crunching I have managed to get the new UI idea up onto the live link (http://www.forxdesign.co.za/ox-designer/). Due to some recent changes in the GWT-dnd library, it seems possible to remove the bottom "palette" where the widgets are dropped on creation. I will be looking at changing this as well as refining the click-properties panel of each widget. I will either be adding an "edit" link or a context menu (right click, my preference).
I have been accepted into the Google Summer of Code 2008 program to work on a project for OpenMRS. This summer I will be working to create a Patient Note Writer module, which will allow rich text observations to be added in appropriate places/pages of the OpenMRS application. OpenMRS is an open source medical record system framework for developing countries. It's always nice to get accepted into programs like Summer of Code, and it's even nicer to work for an organization with a great cause. Another coincidence is that by profession I also work in software development related to the health care domain. So I am expecting an exciting summer with OpenMRS. I am also hoping to see OpenMRS deployed in Bangladesh in the future.
My first phone discussion about my project with my mentors, Shaun Grannis and James Egg, went well. Shaun and James explained the project to me in detail and I think the project is really interesting. I asked a couple of stupid questions that were not related to the project though, sorry for that Shaun and James hehe …
My first project is to implement a fully functional random sample analyzer that calculates the rate of random agreement among corresponding pairs of records between two data sources. This rate will replace the u rate (the field agreement rate among pairs that are truly non-matched) that comes from the Expectation Maximization analyzer. To get a better overview of the linkage process and the rationale behind it, you should read this publication about record linkage. If you want to know more about the Expectation Maximization algorithm, you can read the wiki or some other journals and publications.
The process for generating the u value for each column is as follows:
- Generate two arrays of Record with the desired maximum sampling size
- Take one Record from each array at a time and do the following:
  - For each demographic data item in the Record, match the values using the selected String matching algorithm (Jaro-Winkler, Levenshtein, Longest Common Substring or Exact Match)
  - If the values from both Records match each other, then increment the match rate of the current demographic data item.
- Repeat the above process until all records have been paired and examined
- Calculate the u value for each demographic data item and set the new u value on the MatchConfig object.
I still need to dig more into the first step and see how each data source is read and converted into a Record object. What do you think about the above process? Did I miss anything?
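The counting part of the process above can be sketched as follows. It is a rough sketch under assumptions, not the Record Linker implementation: records are plain maps from field name to value, the two lists are assumed to already hold the random samples, records are paired by position, and the matcher is pluggable, with exact match standing in for Jaro-Winkler, Levenshtein, or Longest Common Substring. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

class RandomSampleSketch {
    /**
     * Estimates a u value per field: the fraction of sampled record pairs
     * whose values agree according to the supplied matcher.
     */
    static Map<String, Double> uValues(List<Map<String, String>> sampleA,
                                       List<Map<String, String>> sampleB,
                                       List<String> fields,
                                       BiPredicate<String, String> matcher) {
        int pairs = Math.min(sampleA.size(), sampleB.size());
        Map<String, Double> u = new LinkedHashMap<>();
        for (String field : fields) {
            int agree = 0;
            for (int i = 0; i < pairs; i++) {
                String a = sampleA.get(i).get(field);
                String b = sampleB.get(i).get(field);
                // Missing values are counted as non-agreement in this sketch.
                if (a != null && b != null && matcher.test(a, b)) {
                    agree++;
                }
            }
            u.put(field, pairs == 0 ? 0.0 : (double) agree / pairs);
        }
        return u;
    }

    public static void main(String[] args) {
        List<Map<String, String>> a = List.of(
                Map.of("sex", "M", "city", "A"), Map.of("sex", "F", "city", "B"),
                Map.of("sex", "M", "city", "C"), Map.of("sex", "F", "city", "D"));
        List<Map<String, String>> b = List.of(
                Map.of("sex", "M", "city", "A"), Map.of("sex", "M", "city", "X"),
                Map.of("sex", "M", "city", "Y"), Map.of("sex", "M", "city", "Z"));
        // Exact match is used here; any string comparator could be plugged in.
        System.out.println(uValues(a, b, List.of("sex", "city"), String::equals));
    }
}
```

A field with few distinct values, such as sex, agrees by chance far more often than a field like city, which is exactly what the u value is meant to capture.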

I got a very exciting invitation yesterday (actually discovered it in my SPAM folder… eek!). There’s a yearly event (now in its third year) called Science Foo Camp. In short, it’s a multidisciplinary unconference focused on innovation in science and technology. Doing a Connotea search on scifoo gives you a whole host of chatter about the first two meetings, some of which has me in a lather. I’m particularly excited about this opportunity, because it might present me with a forum to talk about some of the deeper challenges with OpenMRS and serving the underserved as we grow our community. Perhaps I’ll also have a forum to talk about the challenges related to clinical informatics in general: data normalization and aggregation, community-wide decision support, using clinical data retrospectively for pharmacovigilance and quality improvement, evolving from clinic-centric records into personal health records, etc.
Maybe, just maybe, someone will be at this conference who will provide us with the activation energy to do some of the really cool projects we’ve all dreamed of doing.

Welcome everybody to my GSoC OpenMRS project Blog.

Really, my application was accepted and I am going to code this summer for the GSoC OpenMRS project. I am happy to work on this because the OpenMRS API is written in Java, which is my preferred programming language. Before I applied, I heard that some software implemented by PIH was going to be deployed in clinics across my country. I found out it was OpenMRS and submitted my application through GSoC. When the results were announced on the Google Summer of Code web page, I was very happy to be starting my first open source project. During my contacts with my mentors, Maros and Christian, I explained what I was able to do with Java and gave them a link to an applet in which I implemented an old African game, “Igisoro” or “Omweso”, at http://www.nzeyi.890m.com/nzeyiIgisoro.php . I was assigned to implement the OpenMRS Synchronization Admin UI. I felt the project would be quite challenging for me because synchronization was the newest feature in the system, so I have to read more about the existing implementation. The main problem I faced was downloading the source code with Subversion, because I was behind a proxy on campus. I thought Subversion didn’t support proxy settings, so I also tried Subclipse and TortoiseSVN, but the results were the same. Finally I downloaded the trunk code on an office computer where the connection goes through a DHCP server. The connection was so slow that it took me the whole night to get the openmrs/trunk code. I followed the installation instructions, but I got a 404 error because Tomcat couldn’t find openmrs even though my deployment with ant had completed successfully. That was another major problem, but I resolved it a little later using the ant commands ant -k clean and ant update. Since then, there have been almost no other errors with my installation.
As a next step, I am going to read more of the code. I am reading the Javadoc of the core API and I need similar documentation for Synchronization (if there is any), because I want to get more familiar with the original design of the system. During the next 3 weeks I have semester 1 exams, so I will be a little busy, but I will continue to plan for my project. Thanks!









