1 | =head1 NAME |
---|
2 | |
---|
3 | perldsc - Perl Data Structures Cookbook |
---|
4 | |
---|
5 | =head1 DESCRIPTION |
---|
6 | |
---|
7 | The single feature most sorely lacking in the Perl programming language |
---|
8 | prior to its 5.0 release was complex data structures. Even without direct |
---|
9 | language support, some valiant programmers did manage to emulate them, but |
---|
10 | it was hard work and not for the faint of heart. You could occasionally |
---|
11 | get away with the C<$m{$AoA,$b}> notation borrowed from B<awk> in which the |
---|
12 | keys are actually more like a single concatenated string C<"$AoA$b">, but |
---|
13 | traversal and sorting were difficult. More desperate programmers even |
---|
14 | hacked Perl's internal symbol table directly, a strategy that proved hard |
---|
15 | to develop and maintain--to put it mildly. |
---|
16 | |
---|
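If you never met the old emulation, here is a minimal sketch of what a subscript like C<$m{$x,$y}> does. The subscripts are joined into one string key with the subscript separator C<$;> (named C<$SUBSEP> by the core English module), so finding one "row" means scanning and splitting every key. The hash contents and variable names are made up for illustration.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);    # gives $; the readable name $SUBSEP

my %m;
$m{3, 4} = "three-four";           # really $m{ join($SUBSEP, 3, 4) }
$m{3, 5} = "three-five";
$m{7, 1} = "seven-one";

# To collect everything in "row" 3 we must scan and split every key.
my @row3;
for my $key (sort keys %m) {
    my ($x, $y) = split /\Q$SUBSEP\E/, $key;
    push @row3, $m{$key} if $x == 3;
}
print "@row3\n";
```

There is only one flat hash here, which is why traversal and sorting by a single dimension were awkward.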
17 | The 5.0 release of Perl let us have complex data structures. You |
---|
18 | can now write something like this and, all of a sudden, you have an array |
---|
19 | with three dimensions! |
---|
20 | |
---|
21 | for $x (1 .. 10) { |
---|
22 | for $y (1 .. 10) { |
---|
23 | for $z (1 .. 10) { |
---|
24 | $AoA[$x][$y][$z] = |
---|
25 | $x ** $y + $z; |
---|
26 | } |
---|
27 | } |
---|
28 | } |
---|
29 | |
---|
30 | Alas, however simple this may appear, underneath it's a much more |
---|
31 | elaborate construct than meets the eye! |
---|
32 | |
---|
33 | How do you print it out? Why can't you just say C<print @AoA>? How do |
---|
34 | you sort it? How can you pass it to a function or get one of these back |
---|
35 | from a function? Is it an object? Can you save it to disk to read |
---|
36 | back later? How do you access whole rows or columns of that matrix? Do |
---|
37 | all the values have to be numeric? |
---|
38 | |
---|
39 | As you see, it's quite easy to become confused. While some small portion |
---|
40 | of the blame for this can be attributed to the reference-based |
---|
41 | implementation, it's really more due to a lack of existing documentation with |
---|
42 | examples designed for the beginner. |
---|
43 | |
---|
44 | This document is meant to be a detailed but understandable treatment of the |
---|
45 | many different sorts of data structures you might want to develop. It |
---|
46 | should also serve as a cookbook of examples. That way, when you need to |
---|
47 | create one of these complex data structures, you can just pinch, pilfer, or |
---|
48 | purloin a drop-in example from here. |
---|
49 | |
---|
50 | Let's look at each of these possible constructs in detail. There are separate |
---|
51 | sections on each of the following: |
---|
52 | |
---|
53 | =over 5 |
---|
54 | |
---|
55 | =item * arrays of arrays |
---|
56 | |
---|
57 | =item * hashes of arrays |
---|
58 | |
---|
59 | =item * arrays of hashes |
---|
60 | |
---|
61 | =item * hashes of hashes |
---|
62 | |
---|
63 | =item * more elaborate constructs |
---|
64 | |
---|
65 | =back |
---|
66 | |
---|
67 | But for now, let's look at general issues common to all |
---|
68 | these types of data structures. |
---|
69 | |
---|
70 | =head1 REFERENCES |
---|
71 | |
---|
72 | The most important thing to understand about all data structures in Perl |
---|
73 | --including multidimensional arrays--is that even though they might |
---|
74 | appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally |
---|
75 | one-dimensional. They can hold only scalar values (meaning a string, |
---|
76 | number, or a reference). They cannot directly contain other arrays or |
---|
77 | hashes, but instead contain I<references> to other arrays or hashes. |
---|
78 | |
---|
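A small sketch may make the "only scalars inside" point concrete. It uses the built-in ref() function to show that each element of the outer array is a single scalar holding an array reference; the data is the same sample used later in this section.

```perl
use strict;
use warnings;

my @AoA = ( [2, 3], [4, 5, 7], [0] );

# The outer array has three elements, each a scalar holding an array reference.
my $outer_count = scalar @AoA;
my @kinds       = map { ref $_ } @AoA;

# A reference counts as one scalar no matter how much data it points to.
my $inner_count = scalar @{ $AoA[1] };

print "$outer_count outer elements of kind @kinds; row 1 holds $inner_count\n";
```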
79 | You can't use a reference to an array or hash in quite the same way that you |
---|
80 | would a real array or hash. For C or C++ programmers unused to |
---|
81 | distinguishing between arrays and pointers to the same, this can be |
---|
82 | confusing. If so, just think of it as the difference between a structure |
---|
83 | and a pointer to a structure. |
---|
84 | |
---|
85 | You can (and should) read more about references in the perlref(1) man |
---|
86 | page. Briefly, references are rather like pointers that know what they |
---|
87 | point to. (Objects are also a kind of reference, but we won't be needing |
---|
88 | them right away--if ever.) This means that when you have something which |
---|
89 | looks to you like an access to a two-or-more-dimensional array and/or hash, |
---|
90 | what's really going on is that the base type is |
---|
91 | merely a one-dimensional entity that contains references to the next |
---|
92 | level. It's just that you can I<use> it as though it were a |
---|
93 | two-dimensional one. This is actually the way almost all C |
---|
94 | multidimensional arrays work as well. |
---|
95 | |
---|
96 | $array[7][12] # array of arrays |
---|
97 | $array[7]{string} # array of hashes |
---|
98 | $hash{string}[7] # hash of arrays |
---|
99 | $hash{string}{'another string'} # hash of hashes |
---|
100 | |
---|
101 | Now, because the top level contains only references, if you try to print |
---|
102 | out your array with a simple print() function, you'll get something |
---|
103 | that doesn't look very nice, like this: |
---|
104 | |
---|
105 | @AoA = ( [2, 3], [4, 5, 7], [0] ); |
---|
106 | print $AoA[1][2]; |
---|
107 | 7 |
---|
108 | print @AoA; |
---|
109 | ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0) |
---|
110 | |
---|
111 | |
---|
112 | That's because Perl doesn't (ever) implicitly dereference your variables. |
---|
113 | If you want to get at the thing a reference is referring to, then you have |
---|
114 | to do this yourself using either prefix typing indicators, like |
---|
115 | C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows, |
---|
116 | like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>. |
---|
117 | |
---|
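As a minimal sketch of doing the dereferencing yourself, the following reaches the same element with the prefix form and the arrow form, then prints each row's data rather than its address. The variable names C<$prefix>, C<$arrow>, and C<@lines> are illustrative only.

```perl
use strict;
use warnings;

my @AoA  = ( [2, 3], [4, 5, 7], [0] );
my $aref = $AoA[1];                      # a reference to the row (4, 5, 7)

# Prefix form and arrow form reach the same element.
my $prefix = ${$aref}[2];
my $arrow  = $aref->[2];

# Dereference each row yourself to print the data instead of ARRAY(0x...).
my @lines = map { join " ", @{$_} } @AoA;
print "$_\n" for @lines;
```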
118 | =head1 COMMON MISTAKES |
---|
119 | |
---|
120 | The two most common mistakes made in constructing something like |
---|
121 | an array of arrays are either accidentally counting the number of |
---|
122 | elements or else taking a reference to the same memory location |
---|
123 | repeatedly. Here's the case where you just get the count instead |
---|
124 | of a nested array: |
---|
125 | |
---|
126 | for $i (1..10) { |
---|
127 | @array = somefunc($i); |
---|
128 | $AoA[$i] = @array; # WRONG! |
---|
129 | } |
---|
130 | |
---|
131 | That's just the simple case of assigning an array to a scalar and getting |
---|
132 | its element count. If that's what you really and truly want, then you |
---|
133 | might do well to consider being a tad more explicit about it, like this: |
---|
134 | |
---|
135 | for $i (1..10) { |
---|
136 | @array = somefunc($i); |
---|
137 | $counts[$i] = scalar @array; |
---|
138 | } |
---|
139 | |
---|
140 | Here's the case of taking a reference to the same memory location |
---|
141 | again and again: |
---|
142 | |
---|
143 | for $i (1..10) { |
---|
144 | @array = somefunc($i); |
---|
145 | $AoA[$i] = \@array; # WRONG! |
---|
146 | } |
---|
147 | |
---|
148 | So, what's the big problem with that? It looks right, doesn't it? |
---|
149 | After all, I just told you that you need an array of references, so by |
---|
150 | golly, you've made me one! |
---|
151 | |
---|
152 | Unfortunately, while this is true, it's still broken. All the references |
---|
153 | in @AoA refer to the I<very same place>, and they will therefore all hold |
---|
154 | whatever was last in @array! It's similar to the problem demonstrated in |
---|
155 | the following C program: |
---|
156 | |
---|
157 | #include <pwd.h> |
---|
158 | main() { |
---|
159 | struct passwd *getpwnam(), *rp, *dp; |
---|
160 | rp = getpwnam("root"); |
---|
161 | dp = getpwnam("daemon"); |
---|
162 | |
---|
163 | printf("daemon name is %s\nroot name is %s\n", |
---|
164 | dp->pw_name, rp->pw_name); |
---|
165 | } |
---|
166 | |
---|
167 | Which will print |
---|
168 | |
---|
169 | daemon name is daemon |
---|
170 | root name is daemon |
---|
171 | |
---|
172 | The problem is that both C<rp> and C<dp> are pointers to the same location |
---|
173 | in memory! In C, you'd have to remember to malloc() yourself some new |
---|
174 | memory. In Perl, you'll want to use the array constructor C<[]> or the |
---|
175 | hash constructor C<{}> instead. Here's the right way to do the preceding |
---|
176 | broken code fragments: |
---|
177 | |
---|
178 | for $i (1..10) { |
---|
179 | @array = somefunc($i); |
---|
180 | $AoA[$i] = [ @array ]; |
---|
181 | } |
---|
182 | |
---|
183 | The square brackets make a reference to a new array with a I<copy> |
---|
184 | of what's in @array at the time of the assignment. This is what |
---|
185 | you want. |
---|
186 | |
---|
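To see the difference side by side, here is a sketch that fills one array with C<\@array> and another with C<[ @array ]> in the same loop. The somefunc() below is a hypothetical stand-in for whatever function you are really calling, and C<@shared> and C<@copied> are names invented for this comparison.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the somefunc() used in the text.
sub somefunc { my ($n) = @_; return ($n, $n * 10) }

our @array;                       # one shared package variable, as in the broken loop
my (@shared, @copied);

for my $i (1 .. 3) {
    @array      = somefunc($i);
    $shared[$i] = \@array;        # every slot refers to the same @array
    $copied[$i] = [ @array ];     # every slot gets its own fresh copy
}

print "shared: @{ $shared[1] }\n";    # whatever was last put in @array
print "copied: @{ $copied[1] }\n";    # the values from the first pass
```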
187 | Note that this will produce something similar, but it's |
---|
188 | much harder to read: |
---|
189 | |
---|
190 | for $i (1..10) { |
---|
191 | @array = 0 .. $i; |
---|
192 | @{$AoA[$i]} = @array; |
---|
193 | } |
---|
194 | |
---|
195 | Is it the same? Well, maybe so--and maybe not. The subtle difference |
---|
196 | is that when you assign something in square brackets, you know for sure |
---|
197 | it's always a brand new reference with a new I<copy> of the data. |
---|
198 | Something else could be going on in this new case with the C<@{$AoA[$i]}> |
---|
199 | dereference on the left-hand-side of the assignment. It all depends on |
---|
200 | whether C<$AoA[$i]> had been undefined to start with, or whether it |
---|
201 | already contained a reference. If you had already populated @AoA with |
---|
202 | references, as in |
---|
203 | |
---|
204 | $AoA[3] = \@another_array; |
---|
205 | |
---|
206 | Then the assignment with the indirection on the left-hand-side would |
---|
207 | use the existing reference that was already there: |
---|
208 | |
---|
209 | @{$AoA[3]} = @array; |
---|
210 | |
---|
211 | Of course, this I<would> have the "interesting" effect of clobbering |
---|
212 | @another_array. (Have you ever noticed how when a programmer says |
---|
213 | something is "interesting", that rather than meaning "intriguing", |
---|
214 | they're disturbingly more apt to mean that it's "annoying", |
---|
215 | "difficult", or both? :-) |
---|
216 | |
---|
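The clobbering is easy to reproduce. In this sketch, slot 3 already holds a reference while slot 4 is still undefined, so the same-looking assignment behaves differently in each; the sample values are arbitrary.

```perl
use strict;
use warnings;

my @another_array = ("keep", "me");
my @array         = (1, 2, 3);
my @AoA;

$AoA[3] = \@another_array;        # slot 3 already holds a reference
@{ $AoA[3] } = @array;            # assigns THROUGH it, overwriting @another_array
@{ $AoA[4] } = @array;            # slot 4 was undefined, so a new array springs up

print "another_array is now: @another_array\n";
```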
217 | So just remember always to use the array or hash constructors with C<[]> |
---|
218 | or C<{}>, and you'll be fine, although it's not always optimally |
---|
219 | efficient. |
---|
220 | |
---|
221 | Surprisingly, the following dangerous-looking construct will |
---|
222 | actually work out fine: |
---|
223 | |
---|
224 | for $i (1..10) { |
---|
225 | my @array = somefunc($i); |
---|
226 | $AoA[$i] = \@array; |
---|
227 | } |
---|
228 | |
---|
229 | That's because my() is more of a run-time statement than it is a |
---|
230 | compile-time declaration I<per se>. This means that the my() variable is |
---|
231 | remade afresh each time through the loop. So even though it I<looks> as |
---|
232 | though you stored the same variable reference each time, you actually did |
---|
233 | not! This is a subtle distinction that can produce more efficient code at |
---|
234 | the risk of misleading all but the most experienced of programmers. So I |
---|
235 | usually advise against teaching it to beginners. In fact, except for |
---|
236 | passing arguments to functions, I seldom like to see the gimme-a-reference |
---|
237 | operator (backslash) used much at all in code. Instead, I advise |
---|
238 | beginners that they (and most of the rest of us) should try to use the |
---|
239 | much more easily understood constructors C<[]> and C<{}> instead of |
---|
240 | relying upon lexical (or dynamic) scoping and hidden reference-counting to |
---|
241 | do the right thing behind the scenes. |
---|
242 | |
---|
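If you want to convince yourself that the my() version really does store distinct arrays, a sketch like this one will do it. Again, somefunc() is a hypothetical stand-in, not a real library routine.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the somefunc() used in the text.
sub somefunc { my ($n) = @_; return (1 .. $n) }

my @AoA;
for my $i (1 .. 3) {
    my @array = somefunc($i);     # a new @array is created on every pass
    $AoA[$i]  = \@array;          # so each stored reference is distinct
}

print "row $_: @{ $AoA[$_] }\n" for 1 .. 3;
```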
243 | In summary: |
---|
244 | |
---|
245 | $AoA[$i] = [ @array ]; # usually best |
---|
246 | $AoA[$i] = \@array; # perilous; just how my() was that array? |
---|
247 | @{ $AoA[$i] } = @array; # way too tricky for most programmers |
---|
248 | |
---|
249 | |
---|
250 | =head1 CAVEAT ON PRECEDENCE |
---|
251 | |
---|
252 | Speaking of things like C<@{$AoA[$i]}>, the following are actually the |
---|
253 | same thing: |
---|
254 | |
---|
255 | $aref->[2][2] # clear |
---|
256 | $$aref[2][2] # confusing |
---|
257 | |
---|
258 | That's because Perl's precedence rules on its five prefix dereferencers |
---|
259 | (which look like someone swearing: C<$ @ * % &>) make them bind more |
---|
260 | tightly than the postfix subscripting brackets or braces! This will no |
---|
261 | doubt come as a great shock to the C or C++ programmer, who is quite |
---|
262 | accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th> |
---|
263 | element of C<a>. That is, they first take the subscript, and only then |
---|
264 | dereference the thing at that subscript. That's fine in C, but this isn't C. |
---|
265 | |
---|
266 | The seemingly equivalent construct in Perl, C<$$aref[$i]> first does |
---|
267 | the deref of $aref, making it take $aref as a reference to an |
---|
268 | array, and then dereference that, and finally tell you the I<i'th> value |
---|
269 | of the array pointed to by $aref. If you wanted the C notion, you'd have to |
---|
270 | write C<${$AoA[$i]}> to force the C<$AoA[$i]> to get evaluated first |
---|
271 | before the leading C<$> dereferencer. |
---|
272 | |
---|
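Here is a short sketch of the two readings. The data and the names C<$perl_way> and C<$c_way> are invented for the comparison; both expressions happen to land on the same element because C<$aref> points at C<@AoA>.

```perl
use strict;
use warnings;

my @AoA  = ( [ "a", "b" ], [ "c", "d" ] );
my $aref = \@AoA;

# $$aref[1] dereferences $aref first, then subscripts: same as $aref->[1].
my $perl_way = $$aref[1][0];

# ${ $AoA[1] }[0] subscripts @AoA first, then dereferences that element,
# which is what a C programmer expects *a[i] to mean.
my $c_way = ${ $AoA[1] }[0];

print "$perl_way $c_way\n";
```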
273 | =head1 WHY YOU SHOULD ALWAYS C<use strict> |
---|
274 | |
---|
275 | If this is starting to sound scarier than it's worth, relax. Perl has |
---|
276 | some features to help you avoid its most common pitfalls. The best |
---|
277 | way to avoid getting confused is to start every program like this: |
---|
278 | |
---|
279 | #!/usr/bin/perl -w |
---|
280 | use strict; |
---|
281 | |
---|
282 | This way, you'll be forced to declare all your variables with my(), and |
---|
283 | accidental "symbolic dereferencing" will be disallowed. Therefore if you'd done |
---|
284 | this: |
---|
285 | |
---|
286 | my $aref = [ |
---|
287 | [ "fred", "barney", "pebbles", "bambam", "dino", ], |
---|
288 | [ "homer", "bart", "marge", "maggie", ], |
---|
289 | [ "george", "jane", "elroy", "judy", ], |
---|
290 | ]; |
---|
291 | |
---|
292 | print $aref[2][2]; |
---|
293 | |
---|
294 | The compiler would immediately flag that as an error I<at compile time>, |
---|
295 | because you were accidentally accessing C<@aref>, an undeclared |
---|
296 | variable, and it would thereby remind you to write instead: |
---|
297 | |
---|
298 | print $aref->[2][2] |
---|
299 | |
---|
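You can watch the compiler catch this without crashing your own program by compiling the mistaken line inside a string eval. This is only a demonstration sketch; the data is shortened and the names C<$compiled>, C<$oops>, and C<$error> are illustrative.

```perl
use strict;
use warnings;

# Compile the mistaken snippet inside a string eval so the error can be inspected.
my $compiled = eval q{
    use strict;
    my $aref = [ [ "fred" ], [ "homer" ], [ "george", "jane", "elroy" ] ];
    my $oops = $aref[2][2];       # an element of @aref, which was never declared
    1;
};

my $error = $@;
print "compile failed: $error" unless $compiled;
```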
300 | =head1 DEBUGGING |
---|
301 | |
---|
302 | Before version 5.002, the standard Perl debugger didn't do a very nice job of |
---|
303 | printing out complex data structures. With 5.002 or above, the |
---|
304 | debugger includes several new features, including command line editing as |
---|
305 | well as the C<x> command to dump out complex data structures. For |
---|
306 | example, given an assignment to $AoA like the one to $aref above, here's the debugger output: |
---|
307 | |
---|
308 | DB<1> x $AoA |
---|
309 | $AoA = ARRAY(0x13b5a0) |
---|
310 | 0 ARRAY(0x1f0a24) |
---|
311 | 0 'fred' |
---|
312 | 1 'barney' |
---|
313 | 2 'pebbles' |
---|
314 | 3 'bambam' |
---|
315 | 4 'dino' |
---|
316 | 1 ARRAY(0x13b558) |
---|
317 | 0 'homer' |
---|
318 | 1 'bart' |
---|
319 | 2 'marge' |
---|
320 | 3 'maggie' |
---|
321 | 2 ARRAY(0x13b540) |
---|
322 | 0 'george' |
---|
323 | 1 'jane' |
---|
324 | 2 'elroy' |
---|
325 | 3 'judy' |
---|
326 | |
---|
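Outside the debugger, one way to get a comparable dump is the core Data::Dumper module. It is not otherwise covered in this document, so treat the following as a sketch of one option rather than the recommended method; the two settings shown are ordinary Data::Dumper configuration variables.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent   = 1;      # compact, fixed indentation
$Data::Dumper::Sortkeys = 1;      # stable key order for any hashes

my $AoA = [
    [ "fred",   "barney", "pebbles", "bambam", "dino" ],
    [ "homer",  "bart",   "marge",   "maggie" ],
    [ "george", "jane",   "elroy",   "judy" ],
];

my $dump = Dumper($AoA);
print $dump;
```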
327 | =head1 CODE EXAMPLES |
---|
328 | |
---|
329 | Presented with little comment (these will get their own manpages someday), |
---|
330 | here are short code examples illustrating access of various |
---|
331 | types of data structures. |
---|
332 | |
---|
333 | =head1 ARRAYS OF ARRAYS |
---|
334 | |
---|
335 | =head2 Declaration of an ARRAY OF ARRAYS |
---|
336 | |
---|
337 | @AoA = ( |
---|
338 | [ "fred", "barney" ], |
---|
339 | [ "george", "jane", "elroy" ], |
---|
340 | [ "homer", "marge", "bart" ], |
---|
341 | ); |
---|
342 | |
---|
343 | =head2 Generation of an ARRAY OF ARRAYS |
---|
344 | |
---|
345 | # reading from file |
---|
346 | while ( <> ) { |
---|
347 | push @AoA, [ split ]; |
---|
348 | } |
---|
349 | |
---|
350 | # calling a function |
---|
351 | for $i ( 1 .. 10 ) { |
---|
352 | $AoA[$i] = [ somefunc($i) ]; |
---|
353 | } |
---|
354 | |
---|
355 | # using temp vars |
---|
356 | for $i ( 1 .. 10 ) { |
---|
357 | @tmp = somefunc($i); |
---|
358 | $AoA[$i] = [ @tmp ]; |
---|
359 | } |
---|
360 | |
---|
361 | # add to an existing row |
---|
362 | push @{ $AoA[0] }, "wilma", "betty"; |
---|
363 | |
---|
364 | =head2 Access and Printing of an ARRAY OF ARRAYS |
---|
365 | |
---|
366 | # one element |
---|
367 | $AoA[0][0] = "Fred"; |
---|
368 | |
---|
369 | # another element |
---|
370 | $AoA[1][1] =~ s/(\w)/\u$1/; |
---|
371 | |
---|
372 | # print the whole thing with refs |
---|
373 | for $aref ( @AoA ) { |
---|
374 | print "\t [ @$aref ],\n"; |
---|
375 | } |
---|
376 | |
---|
377 | # print the whole thing with indices |
---|
378 | for $i ( 0 .. $#AoA ) { |
---|
379 | print "\t [ @{$AoA[$i]} ],\n"; |
---|
380 | } |
---|
381 | |
---|
382 | # print the whole thing one at a time |
---|
383 | for $i ( 0 .. $#AoA ) { |
---|
384 | for $j ( 0 .. $#{ $AoA[$i] } ) { |
---|
385 | print "elt $i $j is $AoA[$i][$j]\n"; |
---|
386 | } |
---|
387 | } |
---|
388 | |
---|
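The introduction asked how to get at whole rows or columns. A minimal sketch, using the same sample data as the declaration above: a row is just one dereferenced element, while a column has to be collected from every row.

```perl
use strict;
use warnings;

my @AoA = (
    [ "fred",   "barney" ],
    [ "george", "jane", "elroy" ],
    [ "homer",  "marge", "bart" ],
);

my @row    = @{ $AoA[1] };               # a whole row
my @column = map { $_->[0] } @AoA;       # the first column, one element per row
my @pair   = @{ $AoA[2] }[ 1, 2 ];       # a slice of one row

print "row: @row\ncolumn: @column\npair: @pair\n";
```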
389 | =head1 HASHES OF ARRAYS |
---|
390 | |
---|
391 | =head2 Declaration of a HASH OF ARRAYS |
---|
392 | |
---|
393 | %HoA = ( |
---|
394 | flintstones => [ "fred", "barney" ], |
---|
395 | jetsons => [ "george", "jane", "elroy" ], |
---|
396 | simpsons => [ "homer", "marge", "bart" ], |
---|
397 | ); |
---|
398 | |
---|
399 | =head2 Generation of a HASH OF ARRAYS |
---|
400 | |
---|
401 | # reading from file |
---|
402 | # flintstones: fred barney wilma dino |
---|
403 | while ( <> ) { |
---|
404 | next unless s/^(.*?):\s*//; |
---|
405 | $HoA{$1} = [ split ]; |
---|
406 | } |
---|
407 | |
---|
408 | # reading from file; more temps |
---|
409 | # flintstones: fred barney wilma dino |
---|
410 | while ( $line = <> ) { |
---|
411 | ($who, $rest) = split /:\s*/, $line, 2; |
---|
412 | @fields = split ' ', $rest; |
---|
413 | $HoA{$who} = [ @fields ]; |
---|
414 | } |
---|
415 | |
---|
416 | # calling a function that returns a list |
---|
417 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
---|
418 | $HoA{$group} = [ get_family($group) ]; |
---|
419 | } |
---|
420 | |
---|
421 | # likewise, but using temps |
---|
422 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
---|
423 | @members = get_family($group); |
---|
424 | $HoA{$group} = [ @members ]; |
---|
425 | } |
---|
426 | |
---|
427 | # append new members to an existing family |
---|
428 | push @{ $HoA{"flintstones"} }, "wilma", "betty"; |
---|
429 | |
---|
430 | =head2 Access and Printing of a HASH OF ARRAYS |
---|
431 | |
---|
432 | # one element |
---|
433 | $HoA{flintstones}[0] = "Fred"; |
---|
434 | |
---|
435 | # another element |
---|
436 | $HoA{simpsons}[1] =~ s/(\w)/\u$1/; |
---|
437 | |
---|
438 | # print the whole thing |
---|
439 | foreach $family ( keys %HoA ) { |
---|
440 | print "$family: @{ $HoA{$family} }\n" |
---|
441 | } |
---|
442 | |
---|
443 | # print the whole thing with indices |
---|
444 | foreach $family ( keys %HoA ) { |
---|
445 | print "family: "; |
---|
446 | foreach $i ( 0 .. $#{ $HoA{$family} } ) { |
---|
447 | print " $i = $HoA{$family}[$i]"; |
---|
448 | } |
---|
449 | print "\n"; |
---|
450 | } |
---|
451 | |
---|
452 | # print the whole thing sorted by number of members |
---|
453 | foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) { |
---|
454 | print "$family: @{ $HoA{$family} }\n" |
---|
455 | } |
---|
456 | |
---|
457 | # print the whole thing sorted by number of members and name |
---|
458 | foreach $family ( sort { |
---|
459 | @{$HoA{$b}} <=> @{$HoA{$a}} |
---|
460 | || |
---|
461 | $a cmp $b |
---|
462 | } keys %HoA ) |
---|
463 | { |
---|
464 | print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n"; |
---|
465 | } |
---|
466 | |
---|
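When the same family can turn up on more than one input line, assigning C<[ split ]> would throw away the earlier members. Here is a sketch that appends instead, using push on the dereferenced slot; the C<@lines> array is an in-memory stand-in for the input file.

```perl
use strict;
use warnings;

# In-memory stand-in for the "family: member member ..." input file.
my @lines = (
    "flintstones: fred barney",
    "jetsons: george jane elroy",
    "flintstones: wilma dino",        # a second line for an existing family
);

my %HoA;
for my $line (@lines) {
    my ($who, $rest) = split /:\s*/, $line, 2;
    next unless defined $rest;
    # push creates the inner array on first use and appends afterwards
    push @{ $HoA{$who} }, split ' ', $rest;
}

print "$_: @{ $HoA{$_} }\n" for sort keys %HoA;
```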
467 | =head1 ARRAYS OF HASHES |
---|
468 | |
---|
469 | =head2 Declaration of an ARRAY OF HASHES |
---|
470 | |
---|
471 | @AoH = ( |
---|
472 | { |
---|
473 | Lead => "fred", |
---|
474 | Friend => "barney", |
---|
475 | }, |
---|
476 | { |
---|
477 | Lead => "george", |
---|
478 | Wife => "jane", |
---|
479 | Son => "elroy", |
---|
480 | }, |
---|
481 | { |
---|
482 | Lead => "homer", |
---|
483 | Wife => "marge", |
---|
484 | Son => "bart", |
---|
485 | } |
---|
486 | ); |
---|
487 | |
---|
488 | =head2 Generation of an ARRAY OF HASHES |
---|
489 | |
---|
490 | # reading from file |
---|
491 | # format: LEAD=fred FRIEND=barney |
---|
492 | while ( <> ) { |
---|
493 | $rec = {}; |
---|
494 | for $field ( split ) { |
---|
495 | ($key, $value) = split /=/, $field; |
---|
496 | $rec->{$key} = $value; |
---|
497 | } |
---|
498 | push @AoH, $rec; |
---|
499 | } |
---|
500 | |
---|
501 | |
---|
502 | # reading from file |
---|
503 | # format: LEAD=fred FRIEND=barney |
---|
504 | # no temp |
---|
505 | while ( <> ) { |
---|
506 | push @AoH, { split /[\s=]+/ }; |
---|
507 | } |
---|
508 | |
---|
509 | # calling a function that returns a key/value pair list, like |
---|
510 | # "lead","fred","daughter","pebbles" |
---|
511 | while ( %fields = getnextpairset() ) { |
---|
512 | push @AoH, { %fields }; |
---|
513 | } |
---|
514 | |
---|
515 | # likewise, but using no temp vars |
---|
516 | while (<>) { |
---|
517 | push @AoH, { parsepairs($_) }; |
---|
518 | } |
---|
519 | |
---|
520 | # add key/value to an element |
---|
521 | $AoH[0]{pet} = "dino"; |
---|
522 | $AoH[2]{pet} = "santa's little helper"; |
---|
523 | |
---|
524 | =head2 Access and Printing of an ARRAY OF HASHES |
---|
525 | |
---|
526 | # one element |
---|
527 | $AoH[0]{Lead} = "fred"; |
---|
528 | |
---|
529 | # another element |
---|
530 | $AoH[1]{Lead} =~ s/(\w)/\u$1/; |
---|
531 | |
---|
532 | # print the whole thing with refs |
---|
533 | for $href ( @AoH ) { |
---|
534 | print "{ "; |
---|
535 | for $role ( keys %$href ) { |
---|
536 | print "$role=$href->{$role} "; |
---|
537 | } |
---|
538 | print "}\n"; |
---|
539 | } |
---|
540 | |
---|
541 | # print the whole thing with indices |
---|
542 | for $i ( 0 .. $#AoH ) { |
---|
543 | print "$i is { "; |
---|
544 | for $role ( keys %{ $AoH[$i] } ) { |
---|
545 | print "$role=$AoH[$i]{$role} "; |
---|
546 | } |
---|
547 | print "}\n"; |
---|
548 | } |
---|
549 | |
---|
550 | # print the whole thing one at a time |
---|
551 | for $i ( 0 .. $#AoH ) { |
---|
552 | for $role ( keys %{ $AoH[$i] } ) { |
---|
553 | print "elt $i $role is $AoH[$i]{$role}\n"; |
---|
554 | } |
---|
555 | } |
---|
556 | |
---|
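Selecting and sorting records in an array of hashes follows the same pattern as the printing loops. This sketch uses the sample data from the declaration above; the names C<@with_son> and C<@leads> are invented for the example.

```perl
use strict;
use warnings;

my @AoH = (
    { Lead => "fred",   Friend => "barney" },
    { Lead => "george", Wife   => "jane",  Son => "elroy" },
    { Lead => "homer",  Wife   => "marge", Son => "bart" },
);

# Records that have a Son field at all.
my @with_son = grep { exists $_->{Son} } @AoH;

# Leads ordered by how many fields their record carries, then by name.
my @leads = map  { $_->{Lead} }
            sort { keys(%$b) <=> keys(%$a) or $a->{Lead} cmp $b->{Lead} } @AoH;

print "with a son: ", scalar(@with_son), "\n";
print "leads: @leads\n";
```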
557 | =head1 HASHES OF HASHES |
---|
558 | |
---|
559 | =head2 Declaration of a HASH OF HASHES |
---|
560 | |
---|
561 | %HoH = ( |
---|
562 | flintstones => { |
---|
563 | lead => "fred", |
---|
564 | pal => "barney", |
---|
565 | }, |
---|
566 | jetsons => { |
---|
567 | lead => "george", |
---|
568 | wife => "jane", |
---|
569 | "his boy" => "elroy", |
---|
570 | }, |
---|
571 | simpsons => { |
---|
572 | lead => "homer", |
---|
573 | wife => "marge", |
---|
574 | kid => "bart", |
---|
575 | }, |
---|
576 | ); |
---|
577 | |
---|
578 | =head2 Generation of a HASH OF HASHES |
---|
579 | |
---|
580 | # reading from file |
---|
581 | # flintstones: lead=fred pal=barney wife=wilma pet=dino |
---|
582 | while ( <> ) { |
---|
583 | next unless s/^(.*?):\s*//; |
---|
584 | $who = $1; |
---|
585 | for $field ( split ) { |
---|
586 | ($key, $value) = split /=/, $field; |
---|
587 | $HoH{$who}{$key} = $value; |
---|
588 | } |
---|
589 | } |
---|
590 | |
---|
591 | # reading from file; more temps |
---|
592 | while ( <> ) { |
---|
593 | next unless s/^(.*?):\s*//; |
---|
594 | $who = $1; |
---|
595 | $rec = {}; |
---|
596 | $HoH{$who} = $rec; |
---|
597 | for $field ( split ) { |
---|
598 | ($key, $value) = split /=/, $field; |
---|
599 | $rec->{$key} = $value; |
---|
600 | } |
---|
601 | } |
---|
602 | |
---|
603 | # calling a function that returns a key,value hash |
---|
604 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
---|
605 | $HoH{$group} = { get_family($group) }; |
---|
606 | } |
---|
607 | |
---|
608 | # likewise, but using temps |
---|
609 | for $group ( "simpsons", "jetsons", "flintstones" ) { |
---|
610 | %members = get_family($group); |
---|
611 | $HoH{$group} = { %members }; |
---|
612 | } |
---|
613 | |
---|
614 | # append new members to an existing family |
---|
615 | %new_folks = ( |
---|
616 | wife => "wilma", |
---|
617 | pet => "dino", |
---|
618 | ); |
---|
619 | |
---|
620 | for $what (keys %new_folks) { |
---|
621 | $HoH{flintstones}{$what} = $new_folks{$what}; |
---|
622 | } |
---|
623 | |
---|
624 | =head2 Access and Printing of a HASH OF HASHES |
---|
625 | |
---|
626 | # one element |
---|
627 | $HoH{flintstones}{wife} = "wilma"; |
---|
628 | |
---|
629 | # another element |
---|
630 | $HoH{simpsons}{lead} =~ s/(\w)/\u$1/; |
---|
631 | |
---|
632 | # print the whole thing |
---|
633 | foreach $family ( keys %HoH ) { |
---|
634 | print "$family: { "; |
---|
635 | for $role ( keys %{ $HoH{$family} } ) { |
---|
636 | print "$role=$HoH{$family}{$role} "; |
---|
637 | } |
---|
638 | print "}\n"; |
---|
639 | } |
---|
640 | |
---|
641 | # print the whole thing somewhat sorted |
---|
642 | foreach $family ( sort keys %HoH ) { |
---|
643 | print "$family: { "; |
---|
644 | for $role ( sort keys %{ $HoH{$family} } ) { |
---|
645 | print "$role=$HoH{$family}{$role} "; |
---|
646 | } |
---|
647 | print "}\n"; |
---|
648 | } |
---|
649 | |
---|
650 | |
---|
651 | # print the whole thing sorted by number of members |
---|
652 | foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) { |
---|
653 | print "$family: { "; |
---|
654 | for $role ( sort keys %{ $HoH{$family} } ) { |
---|
655 | print "$role=$HoH{$family}{$role} "; |
---|
656 | } |
---|
657 | print "}\n"; |
---|
658 | } |
---|
659 | |
---|
660 | # establish a sort order (rank) for each role |
---|
661 | $i = 0; |
---|
662 | for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i } |
---|
663 | |
---|
664 | # now print the whole thing sorted by number of members |
---|
665 | foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) { |
---|
666 | print "$family: { "; |
---|
667 | # and print these according to rank order |
---|
668 | for $role ( sort { $rank{$a} <=> $rank{$b} } keys %{ $HoH{$family} } ) { |
---|
669 | print "$role=$HoH{$family}{$role} "; |
---|
670 | } |
---|
671 | print "}\n"; |
---|
672 | } |
---|
673 | |
---|
674 | |
---|
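One thing to watch when looking things up in a hash of hashes is that merely testing a second-level key creates the first-level entry. The following sketch shows the effect and one way around it; the variable names are illustrative.

```perl
use strict;
use warnings;

my %HoH = ( flintstones => { lead => "fred", pal => "barney" } );

# Testing a second-level key creates the first-level entry as a side effect.
my $found  = exists $HoH{jetsons}{lead};      # false, but...
my $sprang = exists $HoH{jetsons};            # ...the jetsons entry now exists

# Checking one level at a time avoids creating entries by accident.
my $safe  = exists $HoH{simpsons} && exists $HoH{simpsons}{lead};
my $clean = !exists $HoH{simpsons};           # still no simpsons entry

print "families: @{[ sort keys %HoH ]}\n";
```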
675 | =head1 MORE ELABORATE RECORDS |
---|
676 | |
---|
677 | =head2 Declaration of MORE ELABORATE RECORDS |
---|
678 | |
---|
679 | Here's a sample showing how to create and use a record whose fields are of |
---|
680 | many different sorts: |
---|
681 | |
---|
682 | $rec = { |
---|
683 | TEXT => $string, |
---|
684 | SEQUENCE => [ @old_values ], |
---|
685 | LOOKUP => { %some_table }, |
---|
686 | THATCODE => \&some_function, |
---|
687 | THISCODE => sub { $_[0] ** $_[1] }, |
---|
688 | HANDLE => \*STDOUT, |
---|
689 | }; |
---|
690 | |
---|
691 | print $rec->{TEXT}; |
---|
692 | |
---|
693 | print $rec->{SEQUENCE}[0]; |
---|
694 | $last = pop @{ $rec->{SEQUENCE} }; |
---|
695 | |
---|
696 | print $rec->{LOOKUP}{"key"}; |
---|
697 | ($first_k, $first_v) = each %{ $rec->{LOOKUP} }; |
---|
698 | |
---|
699 | $answer = $rec->{THATCODE}->($arg); |
---|
700 | $answer = $rec->{THISCODE}->($arg1, $arg2); |
---|
701 | |
---|
702 | # careful of extra block braces on fh ref |
---|
703 | print { $rec->{HANDLE} } "a string\n"; |
---|
704 | |
---|
705 | use FileHandle; |
---|
706 | $rec->{HANDLE}->autoflush(1); |
---|
707 | $rec->{HANDLE}->print(" a string\n"); |
---|
708 | |
---|
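The record above depends on variables defined elsewhere, so here is a cut-down sketch you can run as it stands. It assumes an in-memory filehandle in place of STDOUT so the output can be inspected, and it fills the fields with made-up values.

```perl
use strict;
use warnings;

my $buffer = "";
open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";

my $rec = {
    TEXT     => "hello",
    SEQUENCE => [ 1, 2, 3 ],
    LOOKUP   => { key => "value" },
    THISCODE => sub { $_[0] ** $_[1] },
    HANDLE   => $out,
};

my $last   = pop @{ $rec->{SEQUENCE} };          # removes and returns the 3
my $answer = $rec->{THISCODE}->(2, 5);           # 2 ** 5
print { $rec->{HANDLE} } "$rec->{TEXT} $rec->{LOOKUP}{key}\n";
close $out or die "cannot close in-memory handle: $!";

print "last=$last answer=$answer wrote: $buffer";
```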
709 | =head2 Declaration of a HASH OF COMPLEX RECORDS |
---|
710 | |
---|
711 | %TV = ( |
---|
712 | flintstones => { |
---|
713 | series => "flintstones", |
---|
714 | nights => [ qw(monday thursday friday) ], |
---|
715 | members => [ |
---|
716 | { name => "fred", role => "lead", age => 36, }, |
---|
717 | { name => "wilma", role => "wife", age => 31, }, |
---|
718 | { name => "pebbles", role => "kid", age => 4, }, |
---|
719 | ], |
---|
720 | }, |
---|
721 | |
---|
722 | jetsons => { |
---|
723 | series => "jetsons", |
---|
724 | nights => [ qw(wednesday saturday) ], |
---|
725 | members => [ |
---|
726 | { name => "george", role => "lead", age => 41, }, |
---|
727 | { name => "jane", role => "wife", age => 39, }, |
---|
728 | { name => "elroy", role => "kid", age => 9, }, |
---|
729 | ], |
---|
730 | }, |
---|
731 | |
---|
732 | simpsons => { |
---|
733 | series => "simpsons", |
---|
734 | nights => [ qw(monday) ], |
---|
735 | members => [ |
---|
736 | { name => "homer", role => "lead", age => 34, }, |
---|
737 | { name => "marge", role => "wife", age => 37, }, |
---|
738 | { name => "bart", role => "kid", age => 11, }, |
---|
739 | ], |
---|
740 | }, |
---|
741 | ); |
---|
742 | |
---|
743 | =head2 Generation of a HASH OF COMPLEX RECORDS |
---|
744 | |
---|
745 | # reading from file |
---|
746 | # this is most easily done by having the file itself be |
---|
747 | # in the raw data format as shown above. perl is happy |
---|
748 | # to parse complex data structures if declared as data, so |
---|
749 | # sometimes it's easiest to do that |
---|
750 | |
---|
751 | # here's a piece by piece build up |
---|
752 | $rec = {}; |
---|
753 | $rec->{series} = "flintstones"; |
---|
754 | $rec->{nights} = [ find_days() ]; |
---|
755 | |
---|
756 | @members = (); |
---|
757 | # assume this file in field=value syntax |
---|
758 | while (<>) { |
---|
759 | %fields = split /[\s=]+/; |
---|
760 | push @members, { %fields }; |
---|
761 | } |
---|
762 | $rec->{members} = [ @members ]; |
---|
763 | |
---|
764 | # now remember the whole thing |
---|
765 | $TV{ $rec->{series} } = $rec; |
---|
766 | |
---|
767 | ########################################################### |
---|
768 | # now, you might want to make interesting extra fields that |
---|
769 | # include pointers back into the same data structure so if |
---|
770 | # you change one piece, it changes everywhere, like for example |
---|
771 | # if you wanted a {kids} field that was a reference |
---|
772 | # to an array of the kids' records without having duplicate |
---|
773 | # records and thus update problems. |
---|
774 | ########################################################### |
---|
775 | foreach $family (keys %TV) { |
---|
776 | $rec = $TV{$family}; # temp pointer |
---|
777 | @kids = (); |
---|
778 | for $person ( @{ $rec->{members} } ) { |
---|
779 | if ($person->{role} =~ /kid|son|daughter/) { |
---|
780 | push @kids, $person; |
---|
781 | } |
---|
782 | } |
---|
783 | # REMEMBER: $rec and $TV{$family} point to same data!! |
---|
784 | $rec->{kids} = [ @kids ]; |
---|
785 | } |
---|
786 | |
---|
787 | # you copied the array, but the array itself contains pointers |
---|
788 | # to uncopied objects. this means that if you make bart get |
---|
789 | # older via |
---|
790 | |
---|
791 | $TV{simpsons}{kids}[0]{age}++; |
---|
792 | |
---|
793 | # then this would also change in |
---|
794 | print $TV{simpsons}{members}[2]{age}; |
---|
795 | |
---|
796 | # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2] |
---|
797 | # both point to the same underlying anonymous hash table |
---|
798 | |
---|
799 | # print the whole thing |
---|
800 | foreach $family ( keys %TV ) { |
---|
801 | print "the $family"; |
---|
802 | print " is on during @{ $TV{$family}{nights} }\n"; |
---|
803 | print "its members are:\n"; |
---|
804 | for $who ( @{ $TV{$family}{members} } ) { |
---|
805 | print " $who->{name} ($who->{role}), age $who->{age}\n"; |
---|
806 | } |
---|
807 | print "it turns out that ", (map { $_->{name} } grep { $_->{role} eq "lead" } @{ $TV{$family}{members} }), " has "; |
---|
808 | print scalar ( @{ $TV{$family}{kids} } ), " kids named "; |
---|
809 | print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } ); |
---|
810 | print "\n"; |
---|
811 | } |
---|
812 | |
---|
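The sharing between the {kids} and {members} fields can be checked with a much smaller structure. This sketch keeps only two members and uses grep in place of the explicit loop; otherwise it follows the same idea of copying the array but not the records inside it.

```perl
use strict;
use warnings;

my %TV = (
    simpsons => {
        members => [
            { name => "homer", role => "lead", age => 34 },
            { name => "bart",  role => "kid",  age => 11 },
        ],
    },
);

# [ grep ... ] builds a new array, but its elements are the same hash references.
my $rec = $TV{simpsons};
$rec->{kids} = [ grep { $_->{role} eq "kid" } @{ $rec->{members} } ];

$TV{simpsons}{kids}[0]{age}++;                       # bart gets older via {kids}
my $age_via_members = $TV{simpsons}{members}[1]{age};

print "bart is now $age_via_members\n";
```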
813 | =head1 Database Ties |
---|
814 | |
---|
815 | You cannot easily tie a multilevel data structure (such as a hash of |
---|
816 | hashes) to a dbm file. The first problem is that all but GDBM and |
---|
817 | Berkeley DB have size limitations, but beyond that, you also have problems |
---|
818 | with how references are to be represented on disk. One experimental |
---|
819 | module that does partially attempt to address this need is the MLDBM |
---|
820 | module. Check your nearest CPAN site as described in L<perlmodlib> for |
---|
821 | source code to MLDBM. |
---|
822 | |
---|
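The underlying idea is to flatten the nested part into a single string, since a string is all a dbm value can be. As a rough sketch of that idea only, here is the core Storable module doing the flattening in memory. Storable is an assumption of this example and is not otherwise discussed in this document, and no MLDBM or dbm calls are shown.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my %HoH = (
    flintstones => { lead => "fred",   pal  => "barney" },
    jetsons     => { lead => "george", wife => "jane" },
);

# freeze() flattens the nested structure into one string, which is the kind
# of value a dbm file can hold; thaw() rebuilds an independent copy.
my $frozen = freeze(\%HoH);
my $copy   = thaw($frozen);

$copy->{jetsons}{lead} = "George";
print "original: $HoH{jetsons}{lead}, copy: $copy->{jetsons}{lead}\n";
```

Because the thawed copy is independent, changing it does not write anything back to the original, which is one reason tying a multilevel structure is not as simple as it looks.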
823 | =head1 SEE ALSO |
---|
824 | |
---|
825 | perlref(1), perllol(1), perldata(1), perlobj(1) |
---|
826 | |
---|
827 | =head1 AUTHOR |
---|
828 | |
---|
829 | Tom Christiansen <F<tchrist@perl.com>> |
---|
830 | |
---|
831 | Last update: |
---|
832 | Wed Oct 23 04:57:50 MET DST 1996 |
---|