Compound Data
Assumed Knowledge
Learning Outcomes
- Understand how larger chunks of data are stored in a program
- Understand how to access these larger chunks of data
- Understand the most common data chunks, arrays
So far we have been working only with atomic data. That is data that is “just one thing” and can’t be broken up. This is getting unwieldy. Everytime we have something to draw, we need to have an x-position and a y-position and so we need two variables for something we probably think of as just one thing (the position on the screen).
Wouldn’t it be nice to have just one variable for both? Yes, yes it would.
The facility we need for this is composite data. We need a way to put two values (the x and the y positions) into a single variable. What type would this be? Not an int
or a float
or a char
! Processing does have types we can use for this, but they act a little differently from the types we already know.
Let me introduce you to one of these types, PVector
.
Recall that processing has values (like 1,c,4.5) that are grouped into types (like int
, char
, float
). It is possible to have a value with two numbers in it, that value would be of the type PVector
:
- (1,1) is a
PVector
with two values in it,1
and1
- (10,4) is a
PVector
with two values in it,10
and4
.
So you can see that we have a whole new type and a whole new set of values that fit into that type. Lets play with it.
The Processing website has an example of using PVectors
for a bouncing ball program. Study it carefully.
You will notice two interesting features that we will now discuss:
- the use of “dot-notation” to get inside the compound data
- the use of
new
for an extra step in the declare-and-initialise process.
Compound Data in Memory
So, compound data is bigger, so much so, that it can’t fit into a single memory slot. But variables can only point to single buckets, so what do we do?
For compound data, a variable will point to one memory location that then tells you where the larger chunk of memory is.
int[] myArray;
myArray = new int[5];
So, an extra step is required, making the memory box point to another, larger, chunk of memory. This is what new
does. It finds some free space that is big enough, and sets aside that much space. Lets call this step “memory allocation”.
The other new step is to “follow the arrow” from the memory box to the bigger chunk and to pull out one part from there. This is what the .
does. It follows the arrow and then uses the special name that comes after to get one of the values from the larger chunk. Lets call this step “de-referencing”.
Processing actually has two different forms of compound data - objects like PVector
(and many others), and arrays. We will focus on arrays from here but feel free to use objects whenenver you like.
Note: objects are often taught with a related concept - classes - however, we will not address classes in any way at this point.
Note that the two different versions of compound data come with two different syntaxes for dereferences - for objects is it .name
and for arrays it is [number]
. See below for details.
Arrays
To understand how to use compound data, we will look at the most common compound data type, arrays.
An array is a fixed-sized collection of items, each item of the same type.
So you can have,
- an array of 5 integers
- an array of 20 booleans
- an array of 2000 characters
- an array of 50 Strings
- an array of 400 arrays
- and so on…
Syntax to create an array
There are a few ways to create an array.
Creating array - Option 1
The most common way to create an array is by specifying the type and size of the array.
type[] arrayName = new type[size];
Example
int[] arr = new int[5]; //an array that holds 5 integers
boolean[] flags = new boolean[8]; //an array that holds 8 booleans
Creating array - Option 2
When you know the values that need to be stored in the array beforehand, you can create an array as,
type[] arrayName = {item1, item2, ....};
Example
int[] cutoffs = {50, 65, 75, 85};
char[] punctuations = {'.', '!', '?', ',', ';', ':'};
Size of an array
The number of items in an array arr
is given by arr.length
.
Note that, given what we said above this means .length
should be read as “follow the reference and see how large the chunk of memory at the other end is”.
For example,
int[] data = new int[20];
println(data.length); //displays 20
Accessing items of an array
- The first item of an array
arr
is at index 0. - The second item of an array
arr
is at index 1. - …
- The last item of an array
arr
is at indexarr.length - 1
.
Thus you can traverse an array using a loop from 0
to arr.length - 1
as,
for(int i=0; i < arr.length; i++) {
//do what you want with arr[i]
}
Example
Create an array that holds the outcome of 20 dice rolls.
Each dice roll is a random integer between 1 and 6.
int[] outcomes = new int[20];
for(int i=0; i < outcomes.length; i++) {
outcomes[i] = (int)random(1, 7); //remember 7 is not included
}
Then we can find out the average outcome as,
int total = 0;
for(int i=0; i < outcomes.length; i++) {
total += outcomes[i];
}
double average = (total * 1.0)/outcomes.length;
//multiplication with 1.0 to convert int to double
Storage of an array and representation
For the next example, note that an integer takes up 4 bytes of memory.
The array items are held together in a block.
Consider the following array,
int[] arr = new int[4];
If the first item is at memory address 200-203, the second item will be at 204-207, the third item will be at 208-211 and the last item from 212-215.
The array itself holds the starting address, in this case 200. That’s why we say that an array is a reference.
We represent this as,
or a more simplified way of adding the array label above the items.
Example
boolean[] flags = {true, false, false, true, false};
Exercises
Exercise 1
Draw the memory diagram for the following code,
int[] data = new int[]{50, 90, 30, 20, 60};
Exercise 2
Draw the memory diagram for the following code,
int[] data = new int[6];
for(int i=0; i < data.length; i++) {
if(i%2 == 0) {
data[i] = 2*i+1;
}
else {
data[i] = -3*i;
}
}
Exercise 3
Write a piece of code that creates the arrays represented in the following diagram.
Exercise 4
Draw the memory diagram for the following code,
int[] a = {4,8,15,16,23,42};
int[] b = a;
b[2] = 100;
b = new int[]{104,101,108,108,111};
Exercise 5
Write a piece of code that creates an array with 100 integers, each between 1 and 20. Count the number of items that are divisible by the item after them.
For example, if the first few items of the array are {20, 5, 8, 4, 4, ...}
, 20 is divisible by 5, 8 is divisible by 4, and 4 is divisible by 4, so we have 3 such items so far.
Exercise 6
Consider the following piece of code.
int[] items = new int[10];
items[0] = 1;
for(int i=1; i < items.length; i++) {
items[i] = 2 * items[i-1];
}
What are the contents of the array items
?
Furthering your understanding
Coding in the real life
The ages of 20 people is stored in an array ageList
. Write a piece of code that determines the range of the distribution. That is, the age difference between the oldest and the youngest person.
Solution
//assuming ageList is a boolean array containing 20 items
int min = ageList[0];
for(int i=1; i < ageList.length; i++) {
if(ageList[i] < min) {
min = ageList[i];
}
}
int max = ageList[0];
for(int i=1; i < ageList.length; i++) {
if(ageList[i] > max) {
max = ageList[i];
}
}
int range = max - min;
The state of 25 electrical switches is held in an array smartSwitches
. The states can be “On” (true) or “Off” (false). Write a piece of code that toggles all switches. That is, all switches that are currently “On” should turn “Off”, and all switches that are currently “Off” should turn “On”.
Solution
//assuming smartSwitches is a boolean array containing 25 items
for(int i=0; i < smartSwitches.length; i++) {
smartSwitches[i] = !smartSwitches[i];
}
Assume that we are encoding birthdays as integers where:
- 1st january is 0,
- 2nd january is 1,
- and so on till,
- 31st december is 365
(remember that 29th february will also be given an integer mapping).
The birthdays of 1000 people is stored in an array bdays
. Write a piece of code that displays the most frequent date of birth. For the basic version, you may display an integer between 0 and 365. For the advanced version, display the actual data in DD/MM
format.
Solution
//assuming bdays is an array containing 1000 items
//such that each item is in range [0...365]
/*
frequencies holds number of birthdays on each of the 366 days.
if a birthday falls on 1st jan (coded as 0), frequencies[0] increases by 1
if a birthday falls on 2nd jan (coded as 1), frequencies[1] increases by 1
...
if a birthday falls on 32st dec (coded as 365), frequencies[365] increases by 1
*/
int[] frequencies = new int[366];
for(int i=0; i < bdays.length; i++) {
/*
int idx = bdays[i];
frequencies[idx]++; //another birthday on that day
}
int mostFrequentBday = 0; //1st jan
for(int i=1; i < frequencies.length; i++) {
if(frequencies[i] > frequencies[mostFrequentBday]) {
mostFrequentBday = i;
}
}
println(mostFreqBday);
//advanced:
int[] daysInMonths = {31,29,31,30,31,30,31,31,30,31,30,31};
int currentMonth = 0;
while(mostFreqBday < daysInMonths[currentMonth]) {
mostFreqBday = mostFreqBday - daysInMonths[currentMonth];
}
int dayInMonth = mostFreqBday+1;
println(dayInMonth+"/"+currentMonth);
I keep track of the time taken (in minutes) to run each kilometer over a 100km race (let me dream!). These times are stored in an array lapTimes
. For example, if lapTimes[0]
is 6.23, it means I ran the first kilometer in 6 minutes 13.8 seconds.
Write a piece of code that determines my fastest lap (for example, display “Kilometer 0-1” if the first kilometer was the fastest, and “Kilometer 63-64” if the 64th kilometer was the fastest.)
Solution
//assuming lapTimes is a float array containing 100 items
int fastestLap = 0;
for(int i=1; i < lapTimes.length; i++) {
if(lapTimes[i] < lapTimes[fastestLap]) {
fastestLap = i;
}
}
print("Kilometer "+fastestLap+"-"+(fastestLap+1));