Tuesday, 15 July 2008

The Quest for an ObjectScript Iterator (part 4)

Here's a three-key iterator that runs through a global called ^multi:



StartIterator(&i,&j,&k)
    set i = ""
    set j = ""
    set k = ""
    quit

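Next(&i,&j,&k,&val)
    if (i="") {
        ; not started yet: find the first i
        set i=$order(^multi(i))
        quit:i="" 0    ; empty global, nothing to iterate
        quit $$Next(.i,.j,.k,.val)
    }
    if (j="") {
        ; new i: find its first j
        set j=$order(^multi(i,j))
        if j="" {
            ; this i has no second-level keys, so move on to the next i
            set i=$order(^multi(i))
            quit:i="" 0
        }
        quit $$Next(.i,.j,.k,.val)
    }
    ; normal case: advance k under the current i,j
    set k=$order(^multi(i,j,k))
    if k'="" {
        set val = $get(^multi(i,j,k))
        quit 1
    }
    ; ran out of k: advance j
    set j=$order(^multi(i,j))
    if j'="" {
        quit $$Next(.i,.j,.k,.val)
    }
    ; ran out of j: advance i
    set i=$order(^multi(i))
    if i'="" {
        quit $$Next(.i,.j,.k,.val)
    }
    quit 0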


test()
    new i,j,k,val
    do StartIterator(.i,.j,.k)
    while $$Next(.i,.j,.k,.val) {
        write !, i_","_j_","_k_" = "_val
    }
    quit



As you can see, $$Next() is starting to get somewhat complicated: it is not only long, but it also makes several recursive calls to itself. Why is this?

Well, first, compare what the ordinary nested-loop version of this would look like:



order()
    new i,j,k
    set i=""
    for {
        set i=$order(^multi(i))
        quit:i=""
        set j=""
        for {
            set j=$order(^multi(i,j))
            quit:j=""
            set k=""
            for {
                set k=$order(^multi(i,j,k))
                quit:k=""
                write !,i_","_j_","_k_" = "_^multi(i,j,k)
            }
        }
    }
    quit
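
To try either version out, the global needs some data in it. Here is a minimal sketch, assuming a helper I'm calling populate() (a hypothetical name, not part of the routine above) and four made-up nodes:

```objectscript
populate()
    ; Hypothetical helper for illustration only.
    ; WARNING: this kills any existing data in ^multi first.
    kill ^multi
    set ^multi("a","x",1)="one"
    set ^multi("a","x",2)="two"
    set ^multi("a","y",1)="three"
    set ^multi("b","x",1)="four"
    quit
```

With that data in place, running test() or order() from the same routine should write the same four lines, in $order collation sequence:

a,x,1 = one
a,x,2 = two
a,y,1 = three
b,x,1 = four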



For the iterator version, we have to find a way to flatten out that nested structure of $order statements. We also have to cope with the fact that inside our $$Next we don't have any state information except the values of i, j and k. We don't know if j="" because we haven't started looping yet, or because we just reached the end of the js under the current i. (This is something we do have, for free, in the nested loop version.)

And whenever we do an $order on i, we have to go round again and test j, and so on. I'm using recursion to repeat these tests because it makes the code shorter.

So why would we prefer the iterator version to the nested loops? Mainly because it decouples the business logic from the details of the data-structure. test() knows nothing of the name or shape of the global. Also, the iterator is reusable many times, but the nested loops will have to be reconstructed whenever we need to run through the global.
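
To illustrate the reuse, here is a sketch of a second consumer of the same iterator. The name count() is my own, for illustration; it assumes StartIterator() and $$Next() are defined in the same routine, as above:

```objectscript
count()
    ; Hypothetical second consumer: counts the three-key nodes
    ; without mentioning ^multi or any $order calls.
    new i,j,k,val,n
    set n=0
    do StartIterator(.i,.j,.k)
    while $$Next(.i,.j,.k,.val) {
        set n=n+1
    }
    quit n
```

Like test(), it contains only the business logic (here, counting); all the traversal details stay inside $$Next().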

The same principles can be applied to create iterators for globals with more keys, although as the number of keys increases, the size and complexity of the $$Next() function also increases. The pattern remains the same though.

3 comments:

AdamL said...

Hello, I found this blog from George James's blog. You mention the Caché function $QUERY in the second part. Are you aware that it can provide the level of abstraction it seems you are after? You can use it along with the indirection operator (@) to iterate through globals with an arbitrary number of subscripts:

S glb="^MYGLB"
F  S glb=$QUERY(@glb) Q:glb=""  W glb,"=",@glb,!

You will also find the functions $QLENGTH and $QSUBSCRIPT useful. They tell you, respectively, the number of subscripts in a global reference and the value of the subscript at the level requested.

e.g. glb = ^MYGLB(1,"XYZ","asdf",5)
$QLENGTH(glb)=4
$QSUBSCRIPT(glb,3)="asdf"

Adam

PS. From your word usage you seem to be a native English speaker. Any idea why therefore the blog controls are all in Spanish?

phil jones said...

Hi Adam,

thanks for this. Very useful.

I knew about $QUERY but not about $QLENGTH and $QSUBSCRIPT, so my main issue with it was the awkwardness of pulling out the keys from the string representing the global descriptor.

Agreed that this is a more general solution because it won't care about the depth of the tree either.

PS: the blog controls are in Portuguese because I'm based in (and posting from) Brazil. But I don't really understand why, because my other blogs seem to be in English.

AdamL said...

The $QSUBSCRIPT (aka $QS - I imagine you'll have come across Caché/M command abbreviations by now) function is basically a fancy $PIECE that saves you from having to parse the global reference string yourself. Most global reference strings would be fairly simple to parse by hand, but it's possible to get commas and parentheses within subscripts, so it gets more awkward.
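
Putting the pieces from this thread together, a depth-independent iterator might look roughly like the sketch below. The names NextQ() and testq() are made up for illustration, and ^multi is used as the starting reference only as an example. Note that, unlike $$Next() in the post, this visits every node that holds data, at any depth, not just the three-key nodes.

```objectscript
NextQ(&ref,&val)
    ; Hypothetical $QUERY-based iterator.
    ; ref holds the current global reference as a string;
    ; start it at the bare global name, e.g. "^multi".
    set ref=$query(@ref)
    quit:ref="" 0       ; no more nodes with data
    set val=@ref        ; indirection fetches the node's value
    quit 1

testq()
    new ref,val,n,keys
    set ref="^multi"
    while $$NextQ(.ref,.val) {
        ; rebuild "key1,key2,..." using $QLENGTH and $QSUBSCRIPT
        set keys=""
        for n=1:1:$qlength(ref) {
            set keys=keys_$select(n>1:",",1:"")_$qsubscript(ref,n)
        }
        write !,keys_" = "_val
    }
    quit
```

Once NextQ() has returned 0, ref is empty, so it has to be reset to the global name before the iterator is used again.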