July 29, 2007

Anders Gjendem
Personal Blog
Who needs a title? is about »
» Another week, another (minor) change of plans.. :)


Time for another update. Business more or less as usual: conference calls at least twice a week, and some chatting when needed. In general I feel it’s going a bit better now, and we’re actually implementing something that seems promising as a start. I spent some time last week implementing example serialization of both POJOs and the same objects served through Hibernate with CGLIB proxies, to see which problems we would get with the different serialization solutions. We took a look at a third option too, Simple, which seemed to work quite well, except that it lacked one thing we require: some way of supporting several levels of serialization, e.g. deep and shallow. After a short discussion on Friday we decided to go with something based on Julie’s old code, i.e. a homegrown solution, and then change later if it doesn’t work out. The main disadvantages are that, compared to some other solutions, it’ll require more code to keep updated, and we won’t have built-in support for references and circular references. Most likely we’ll be able to avoid most of the pitfalls involving those through the way we plan to implement the serialization.
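
To make the "deep and shallow" requirement a bit more concrete, here is a minimal sketch of what a homegrown, levelled serializer could look like. It is only an illustration under assumptions: the class names, the XML layout, and the reading of "shallow" as "write child ids only" are all made up here and are not taken from Julie's code or from OpenMRS.

```java
import java.util.List;

/** Hand-rolled serializer with two levels (all names here are hypothetical). */
final class LevelledSerializer {
    enum Level { SHALLOW, DEEP }

    static final class Encounter {
        final String id;
        final String type;
        Encounter(String id, String type) { this.id = id; this.type = type; }
    }

    static final class Patient {
        final String id;
        final String name;
        final List<Encounter> encounters;
        Patient(String id, String name, List<Encounter> encounters) {
            this.id = id;
            this.name = name;
            this.encounters = encounters;
        }
    }

    /**
     * SHALLOW writes only a reference (the id) for each child;
     * DEEP nests the full child element. No XML escaping is done here.
     */
    static String toXml(Patient p, Level level) {
        StringBuilder xml = new StringBuilder();
        xml.append("<patient id=\"").append(p.id).append("\">");
        xml.append("<name>").append(p.name).append("</name>");
        for (Encounter e : p.encounters) {
            if (level == Level.DEEP) {
                xml.append("<encounter id=\"").append(e.id).append("\">")
                   .append("<type>").append(e.type).append("</type>")
                   .append("</encounter>");
            } else {
                xml.append("<encounter ref=\"").append(e.id).append("\"/>");
            }
        }
        return xml.append("</patient>").toString();
    }
}
```

Because every class needs its own hand-written `toXml`, this also shows the maintenance cost mentioned above: each new field means another line of serializer code to keep updated.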

The second part I was going to work on last week, the REST based web service/transport layer, is on hold for now. We’ve got more important stuff to finish first, as we need *something* to work ASAP so that OpenMRS can be of use to a certain implementation somewhere in the world. Their requirements aren’t too complex; basically anything that can serialize the objects and merge several servers into one will be of great use. For the next week or two we aim at just having something to help out with that. The transport layer will be handled by a USB stick, so bandwidth and data size aren’t of much importance at this time. For now I’ve only committed a couple of minor updates/changes to the REST module.

My current task involves making a journal/queue system for synchronization records, storing them and keeping track of their state and data. Currently I’m implementing that as just another DAO/Service in the system. As soon as that starts to look like something usable, I’ll also take a look at adding a screen to the Administration section of OpenMRS to track this visually.
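
As a rough idea of what such a journal could track, here is a small in-memory sketch. The states, field names, and methods are assumptions made for illustration; the real thing would sit behind a DAO/Service and persist to the database rather than live in a list.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory sketch of a synchronization journal (names are hypothetical). */
final class SyncJournal {
    enum State { NEW, SENT, COMMITTED, FAILED }

    static final class Entry {
        final long id;
        final String payload;      // the serialized change
        State state = State.NEW;
        int attempts = 0;          // how many times we tried to send it
        Entry(long id, String payload) { this.id = id; this.payload = payload; }
    }

    private final List<Entry> entries = new ArrayList<>();
    private long nextId = 1;

    /** Adds a new record to the end of the queue. */
    Entry enqueue(String payload) {
        Entry e = new Entry(nextId++, payload);
        entries.add(e);
        return e;
    }

    /** Entries that still have to be delivered, oldest first. */
    List<Entry> pending() {
        List<Entry> result = new ArrayList<>();
        for (Entry e : entries) {
            if (e.state == State.NEW || e.state == State.FAILED) {
                result.add(e);
            }
        }
        return result;
    }

    void markSent(Entry e) { e.state = State.SENT; e.attempts++; }
    void markCommitted(Entry e) { e.state = State.COMMITTED; }
    void markFailed(Entry e) { e.state = State.FAILED; }
}
```

An administration screen would then mostly be a listing of these entries with their state and attempt count.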

July 16, 2007

» Midterm update


An update is long overdue; we’re already halfway through the summer, and there’s a lot to do. The synchronization project has turned out a bit different from what I expected before I started: more people involved and a different way of working than what I’m used to, basically more planning and coordination. I guess that’s mainly a good thing, but it tends to make me a bit more passive than I should be.

A lot of discussion has been going on about how to handle the synchronization: what to include, what’s needed, what’s needed if we do it like this, and so on. History and auditing have been a big part of this lately, and that task is now being worked on by two of the other developers. Personally I’d prefer seeing the full implementation of that on all the tables, keeping track of all changes, which would make figuring out the changes between the servers easier since we’d have both the new and the old data available. Currently this is being implemented fully on two tables to see if it works out as planned, and if it’s feasible to implement for the rest too.

The project and the tasks around it are now split up as described on the project plan page.

I’ve been looking at SyncML and JAXB, but I get some errors that I haven’t figured out at the time of writing. Basically I’m coding up a few POJOs to hold the information needed in the XML format, and then trying to use JAXB to bind those to XML. I haven’t touched JAXB before, and so far I must say I prefer XStream :) Better stick with what’s already present though.

The other thing I’ve been trying to get working is a Web Service interface for the synchronization. There’s already another branch that has implemented that, and I’ve been looking around the code there. For some reason the branch doesn’t want to run on my computer, but it’s time to revisit that part now. It would be nice if it was merged into the alpha-trunk so that there wasn’t a need to duplicate much of the work already done there.

My first try at merging those two parts and coding them at the same time failed, of course; it just got too messy since I haven’t used any of those technologies with Java before. I’ve since split it up into separate non-OpenMRS projects just to figure it out, and then I’ll move it back in when I get it working properly.

Should get going with the coding again I guess, lost too much time to other things lately.

June 26, 2007

» More timestamps


Ok then, time for another update. Back in Oslo for a few weeks, bad weather and raining currently, should be good for my project work I guess.

I’ve read a bit about Lamport timestamps as I was asked to do, and concluded so far that in our setting they’re unfortunately not useful. The reason is that the method works by increasing the internal clock (a counter) for every event in the process, e.g. a change of a record, or the addition of a new one. Messages to other processes contain the local clock value. When a process receives a message from another process whose clock is ahead of the local one, it fast-forwards its own clock to the other process’s clock + 1 and continues from there. There’s always at least one clock tick between events.
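
The rules above fit in a few lines. This is a minimal sketch of a Lamport clock as described in the paragraph, with class and method names chosen for illustration; it is not code from the project.

```java
/** Minimal Lamport logical clock (illustrative sketch). */
final class LamportClock {
    private long counter = 0;

    /** Local event, e.g. a record was changed or added: advance the counter. */
    synchronized long tick() {
        return ++counter;
    }

    /** Sending is an event too; the returned value travels with the message. */
    synchronized long send() {
        return ++counter;
    }

    /**
     * Receiving a message: if the sender's clock is ahead, jump past it,
     * otherwise just advance our own counter by one.
     */
    synchronized long receive(long remoteTimestamp) {
        counter = Math.max(counter, remoteTimestamp) + 1;
        return counter;
    }
}
```

Note that the counters only get related to each other inside `receive`. Two servers that never exchange messages just count their own events independently.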

Since our system will only connect with the other process (server) when synchronizing, this is, as far as I can see, not very useful. With Lamport’s method, if x and y happen in different processes that do not exchange messages, you can say neither x -> y nor y -> x. Actually nothing can be said about when the events happened or which event happened first. Because of that they are regarded as concurrent events.

As far as I can see, the best solution we have available at the moment is the one Julie suggested a while back. We keep track of changes on the different servers using the local clock. When we synchronize up to the parent, we try to figure out how much the clock has drifted since the last synchronization, and then apply the drift time to the changed records. It’s certainly not a perfect solution, and it has many pitfalls. I’m actually unsure at this point whether the drift correction is worth doing at all, since it means we’ll have to struggle with updating the child’s records with the drift time too, to make sure we have identical records in both places. Of course, the drift calculation could be taken care of as part of the synchronization communication protocol, figuring out the drift time before we actually capture and serialize the identified changes to the server. Also, who knows which clock is drifting “in the wrong direction”? The server might be the one that had problems with its clock for all we know. I guess the theory is that the servers further up in the hierarchy are handled with more care and have better hardware/internet connections, so that there are fewer problems and more frequent adjustments against official time servers. And fewer people changing time(-zone) settings just for the fun of it, giving us even internally inconsistent data.. I guess that’s a case where we could actually gain some advantage from a logical clock system, and I guess we’ll probably have that in reality with the planned history tables. Ugh, there are so many potential problems if you start digging… :)
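
In its simplest form, the drift correction is just an offset between the two clocks applied to every changed record. The sketch below assumes the parent's clock is the reference and that one constant offset is good enough for all records since the last synchronization; both are simplifications, and the names are made up for illustration.

```java
/** Simplest possible drift correction (illustrative sketch). */
final class DriftCorrection {
    /**
     * Offset to add to child timestamps, measured when both servers
     * read their clocks at (roughly) the same moment during a sync.
     */
    static long offsetMillis(long childNowMillis, long parentNowMillis) {
        return parentNowMillis - childNowMillis;
    }

    /** Moves a timestamp recorded on the child onto the parent's timeline. */
    static long correct(long childTimestampMillis, long offsetMillis) {
        return childTimestampMillis + offsetMillis;
    }
}
```

For example, if the child reads 1000 ms while the parent reads 1250 ms, the offset is 250 ms and a record stamped 400 ms on the child becomes 650 ms. If the drift built up gradually since the last sync, older records would get too large a correction with this approach, which is one of the pitfalls mentioned above.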

If I’m wrong anywhere, I’d be very happy to receive some feedback either here or by mail - I’m sure I’ve missed a point somewhere..

June 19, 2007

» Travel & timesync


Okay, time for a new update I guess. Lots of stuff happening the last couple of weeks, though much (most) of it not GSoC related unfortunately. Skip this paragraph if you don’t care about that. I’ll begin with me being back in Oslo, Norway (actually in my hometown Molde at the time of writing, on a family visit for a few days) after leaving Vietnam on the 13th. Terrible day really; I hope to be able to go back before too long, Vietnam really is a great country in many ways. Anyway, I wish I could say the trip was uneventful, but that would be far from the truth. After leaving Vietnam, my first flight went to Kuala Lumpur, Malaysia, where I spent a few hours in Chinatown waiting for my early morning flight. Somehow I managed to get sick, throwing up+++ and getting fairly dehydrated - the perfect thing to happen before a 12h flight, right? I found a couple of medicines at the pharmacy at the airport and ate both types of pills just to be sure.. they came up again fairly quickly of course, like everything else. I’ll skip the details. When I arrived in Sweden I had to check in again, and I guess I looked fairly suspicious, so the customs guys started asking me questions - first time ever. No problems though. After waiting for a delayed flight to Oslo, the same thing happened there(!). I’ve flown abroad quite a few times and never been stopped before, but now I guess I know what those customs guys look for - unshaved guys with red eyes, pale, sweaty skin and lots of baggage. Again, no problems. So, I’m back, on a much faster and more stable internet connection, and I even have reliable electricity!

So, something more related to GSoC: both before and after this we’ve had several conference calls set up for the group of people involved in the synchronization efforts, with general questions, discussions about how we imagine the solutions, what to do some more research on, who’s doing what etc. I’ll admit that for me it’s a little bit scary, and I’ve probably been too passive until now, especially as the scale of the solution seems to increase quite a bit (and so has the group of developers involved). Nation-wide synchronization has been touched on, but there’s still a lot to look into. My mentor Maros reassured me though: we’ll tighten the scope in a while, but let the discussions drift a bit for now - even some thoughts about patient identification using biometric information like fingerprints with scanners, and how that would work with millions of patients, many of them performing hard manual work that wears down their fingertips.

It feels a bit “scary” for me too sometimes. I realize that the other guys and girls have a lot of experience compared to me, and there’s no doubt that I’ve got a lot to learn from all of them. Also, I’ve always found it a bit hard to have spoken discussions in English with native English speakers; it always makes me a bit stressed, at least in the beginning. I’m sure that’s a thing that will change as I get to know them and the project better.

During the last conference call I got the task of looking into time synchronization of the different events/server instances. We’ll probably need that to make sure we know in a reliable way which record is the newest one. In most cases it won’t be a problem, but clocks drift and all kinds of bad stuff can happen. Maros pointed out some information on Wikipedia about Lamport timestamps, so I’m currently reading the article about that before our next call tomorrow to see if that’s useful. There’s something similar called vector clocks too that I might have to check out.. and maybe something completely different. This is something I haven’t touched or thought much about until now; we’ll see how it turns out and what we end up with in the next few days.

June 5, 2007

» Week 1


Finally I too have gotten around to submitting the foreign student certification form and the required proof of enrollment as a student. Wasn’t as hard as I expected, quite painless actually, the program administrators seem to do a very good job :)

The last week has been a bit slow for me: electricity disappeared one or two times, the internet disconnects several times a day - and even Skype gave me problems this week. Usually Skype calls give me no problem here, but I was unable to stay in a conference call with my mentor Maros, another intern that will work on the project and a third developer/mentor. After a few tries we just gave up - really a shame, as it would probably have been useful for me too to listen to the conversation about the project. Also there are quite a few things I need to fix before I leave Vietnam, like buying some of those “required” gifts that force me to pay even more for overweight luggage. Then trying to avoid some of that by finding a solution for sending stuff back home without having to either wait 3 months, or pay as much as the overweight would cost me.. and so on.

I got my development environment set up early in the week without any big problems. Many of the prerequisites were already present, as I’m working on another project that uses many of the same tools. I actually think the setup is easier on OS X than in many of the Windows/Linux cases; I’m certainly happy with the Mac as a development machine. I updated the Ant 1.6.5 that was installed with Xcode to Ant 1.7. I installed Tomcat, as I usually use Jetty for development testing, run from Maven. Earlier I also updated Subversion to the latest version and compiled it with SSL support, as I was having trouble with the initial code checkout in the beginning. It worked afterwards, but I’m not convinced that the SSL support actually was the key.. as normal http checkout worked fine too when I re-tried that.. anyway, it works now.

I’ve had a look around the code trying to figure out how this project works compared to DHIS. I think I’ve got the very basics now; it’s been a while since I touched JSPs. I’ve also been looking over some key classes in the OpenMRS API. My database is now populated with the demo data that was made available for version 1.1 just yesterday, meaning I’ve got a fairly large data set to test synchronization on, I guess, with the 476000+ observations and 5200+ patients that were imported. Actually, it’s not big at all, but it should be sufficient to test synchronization for now at least.

As for the actual project I’m going to work on, it’s synchronization of the OpenMRS data between different servers/installations. It’s fairly complex if all possible combinations of servers/clients, sync cases etc. are to be handled properly. I’m supposed to implement 3 parts of the system this summer, and we’re going to leave some parts out, e.g. some of the advanced conflict handling, though we’ll provide hooks so that this can be implemented fairly easily when that time comes. The first part, which actually won’t be implemented first, is the generation of a changeset - the data changed since the last synchronization. I have a feeling that that part might not be as straightforward as it sounds, since not all of the data is timestamped currently. Another potential problem would be the time on the different servers being out of sync, which might leave us with some “lost” records. One option would be to implement some kind of changelog and base the changeset on that, then clear it when a synchronization is confirmed by the receiver. Another option would be to define a main server, and use the time/time offset from that somehow when generating the changeset - but then we’d need timestamps everywhere again. We’ll see what we end up with I guess.
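
The changelog option could look roughly like the sketch below: every change is noted as it happens, the changeset is whatever is in the log, and entries are only cleared once the receiver has confirmed them. The class and the use of plain string record ids are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Changelog-based changeset generation (illustrative sketch). */
final class ChangeLog {
    // Ids of records changed since the last confirmed synchronization,
    // kept in the order they were first changed.
    private final Set<String> changed = new LinkedHashSet<>();

    /** Called whenever a record is created or updated. */
    void recordChange(String recordId) {
        changed.add(recordId);
    }

    /** Snapshot of everything that needs to be sent. */
    List<String> changeset() {
        return new ArrayList<>(changed);
    }

    /** Called once the receiver confirms it has stored these records. */
    void confirm(List<String> delivered) {
        changed.removeAll(delivered);
    }
}
```

This needs no timestamps at all, which is the attraction. One gap in this simple version: a record that changes again between taking the changeset and receiving the confirmation would be cleared along with the rest, so a real implementation would need something like a per-change sequence number.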

Until now, XML has been pretty much the only data format mentioned, and quite naturally so. Unfortunately it does take up a lot more bytes than e.g. plain CSV. In the DHIS project we built in compression using Java’s Zip classes. Since most of the XML-specific stuff is repeated over and over again, the compression results were quite good. I guess that in OpenMRS too, some kind of built-in compression in the transport layer would be interesting, to speed things up on the often slow connections in the field. I expect the OpenMRS implementation to face similar limitations as DHIS in that regard.
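
As a small illustration of why repetitive XML compresses well, here is a sketch using the `java.util.zip` classes. GZIP is picked here just as an example; which of the Zip classes DHIS actually used is not stated above, and the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** GZIP round trip for an XML string (illustrative sketch). */
final class XmlCompression {
    static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer is written.
        return buffer.toByteArray();
    }

    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Feeding this a document with a few hundred near-identical `<obs>` elements gives a result much smaller than the input, since the repeated tag names compress to almost nothing.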

Task one will be to select and implement a data model serialization mechanism. I’ll have to take a look at the different options for generating XML again, with JAXB as a potential solution. So far I’ve been working with XStream, which is great software with a good mailing list - but unfortunately I’ve run into problems with XStream and Hibernate due to CGLIB proxies earlier, and I guess that will be a problem in this project too. I really like the Converters from XStream better than annotation-based implementations.. they give a lot more flexibility when it comes to multi-format serialization. Might not need that in OpenMRS though. Another task is figuring out in what order the data has to be committed to maintain referential integrity, unless this can be handled automatically by the framework in some way during serialization, so that de-serialization automatically builds an object hierarchy in the right order. XStream solves this with references in the XML, and child classes wrapped in parent class XML, and then it rebuilds the object hierarchy from the leaf nodes and up. The mentioned problem with changeset detection and lack of timestamps needs to be addressed over the next few weeks too.
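
If the commit order has to be worked out by hand, one common way is a topological sort over the "references" relation between tables: a table is committed only after every table it points to. The sketch below is a generic depth-first version, not anything taken from XStream or OpenMRS, and the table names in the usage note are only examples.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Orders tables so that referenced tables come first (illustrative sketch). */
final class CommitOrder {
    /** dependsOn maps each table to the tables it has foreign keys to. */
    static List<String> order(Map<String, List<String>> dependsOn) {
        List<String> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String table : dependsOn.keySet()) {
            visit(table, dependsOn, done, inProgress, result);
        }
        return result;
    }

    private static void visit(String table, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress,
                              List<String> result) {
        if (done.contains(table)) {
            return;
        }
        if (!inProgress.add(table)) {
            // We came back to a table we are still resolving: a cycle.
            throw new IllegalStateException("Circular reference involving " + table);
        }
        List<String> parents = dependsOn.getOrDefault(table, Collections.emptyList());
        for (String parent : parents) {
            visit(parent, dependsOn, done, inProgress, result);
        }
        inProgress.remove(table);
        done.add(table);
        result.add(table);   // all referenced tables are already in the list
    }
}
```

With `obs` depending on `encounter` and `patient`, and `encounter` depending on `patient`, the result puts `patient` first and `obs` last. Circular references make the sort fail, so they would need separate handling.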

I’m most worried about the last part really; importing the data and handling conflicts isn’t straightforward. I understand that there might be some related functionality in OpenMRS already though, with the dictionary import. Haven’t taken a look at that yet, but I guess it’s on my list.

Looking forward to actually writing some code soon, but first - some more research I guess.

May 24, 2007

» Intro

Finally I had to give in to the pressure and create a blog too, as it’s a requirement for this summer’s OpenMRS-GSoC project I’m going to be part of. I’m not complaining though, and .. I guess it was about time I got hip & cool and started blogging too. I’ll admit that I’ve been considering it for some time, and this was just the final push. :)

I expect that I’ll write mostly about the OpenMRS project I will work on - namely the Synchronization Lite between installations of OpenMRS. On paper it seems quite similar to what I’ve been spending some time on in the DHIS 2.0 project I’m also part of - though in this case I’ll probably look more into frameworks for doing the job, rather than rewriting an existing problematic solution step by step as was/is the case in DHIS. Hopefully both projects will benefit from this and from my coming knowledge about OpenMRS. We hope to integrate DHIS and OpenMRS in the not too distant future.