Multidimensional array and garbage collection

Hi, I am new to Unity and have come across an interesting problem, and I was wondering if someone might help explain it further. I am seeing the following behaviour when I have a Google Tango Manager / Tango AR Camera in my scene, which also has a basic game object with a script attached that creates a multidimensional array. I am using Unity 5.3.5f.

My multidimensional array is 200,80,200 in size (x,y,z). If I create this array and declare a class which contains only TRUE (I’ll explain that more in a second) value types (e.g. int, float, double, bool) and run the code, then I see the GC coming along and cleaning up, taking around 45 ms.

However, if I add a string to my class, the GC time immediately jumps to 150+ ms, a massive leap that is very noticeable in my game. I did some digging and found out that if I add any reference type (a list, a dictionary, my own class) then I get similar results to when I add the string. I read up a bit more and realised that string is actually a reference type, NOT a value type. I also checked that the code wasn’t calling the constructor multiple times and that once the array was built, it was never destroyed or rebuilt.

What I want to understand more is why there is such a huge jump in the GC when I add a string (even empty) to my class which is in the array?

Is this just how long it’s taking the GC to navigate through my array and find the reference objects?

What’s actually going on here?

using UnityEngine;
using System.Collections;
using System.Text;
using System.Collections.Generic;

public class TestObjectClass
{
	public int myint{ get; set; }

	public Vector3 mypos { get; set; }

	public bool mybool { get; set; }

	public string mystring { get; set; }

	public float myfloat { get; set; }

	public Vector2 mysecondPos { get; set; }

	public Color mycolor{ get; set; }
}

public class CreateLargeArrayObjects : MonoBehaviour
{
	TestObjectClass[,,] myBigArrayObjects;

	/// <summary>
	/// Start is called on the frame when a script is enabled just before
	/// any of the Update methods is called the first time.
	/// </summary>
	public void Start ()
	{
		init ();
	}

	public void init ()
	{
		myBigArrayObjects = new TestObjectClass[200,80,200];
		for (int x = 0; x < 200; x++) {
			for (int y = 0; y < 80; y++) {
				for (int z = 0; z < 200; z++) {
					myBigArrayObjects [x, y, z] = new TestObjectClass ();
					myBigArrayObjects [x, y, z].myint = 1;
					myBigArrayObjects [x, y, z].mypos = new Vector3 (10, 10, 10);
					myBigArrayObjects [x, y, z].mysecondPos = new Vector2 (20, 20);
					myBigArrayObjects [x, y, z].mybool = true;
					myBigArrayObjects [x, y, z].myfloat = 1.004f;
					myBigArrayObjects [x, y, z].mycolor = new Color (1, 0, 0);
					myBigArrayObjects [x, y, z].mystring = "absdsg";
				}
			}
		}
	}
}

https://stackoverflow.com/questions/2423111/strings-and-garbage-collection

Check out this thread; it has some info about your case.

First of all, are you aware of how many objects you create there and how much memory they consume?

200 * 80 * 200 == 3 200 000

So your array alone will require 12.8MB on 32-bit and 25.6MB on 64-bit systems. But that’s not the worst thing, as it’s a single chunk of memory. Now you also create 3.2M objects where each object has a size of:

//                               32  |  64
// -------------------------------------------
// -- class overhead --       -> 12  |  24
// public int myint           ->  4  |   4
// public Vector3 mypos       -> 12  |  12
// public bool mybool         ->  1  |   1
// public string mystring     ->  4  |   8
// public float myfloat       ->  4  |   4
// public Vector2 mysecondPos ->  8  |   8
// public Color mycolor       -> 16  |  16
// -------------------------------------------
//                               61  |  77

Note that objects in memory are often aligned to 4 bytes, but just ignore that for now. That means all the objects that you store in your array require 3.2M * 61B == 195.2MB (using the 32-bit size). Single objects are much worse for the GC, as every object has to be handled separately.
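To make the arithmetic above easy to re-run with other dimensions, here is a minimal sketch in plain C# (no Unity dependency). The class and method names are hypothetical, and the header sizes (12 and 24 bytes) are simply the figures from the table above, not measured values; alignment padding is ignored, as in the text.

```csharp
using System;

// Hypothetical helper that only reproduces the estimate from the table above.
public static class MemoryEstimate
{
    // Assumed per-object overhead, taken from the table (not measured).
    public const int HeaderBytes32 = 12;
    public const int HeaderBytes64 = 24;

    // int + Vector3 + bool + float + Vector2 + Color, as listed in the table.
    public const int ValueFieldBytes = 4 + 12 + 1 + 4 + 8 + 16;

    // Size of one TestObjectClass instance, ignoring alignment padding.
    public static long ObjectBytes(bool is64Bit)
    {
        int header = is64Bit ? HeaderBytes64 : HeaderBytes32;
        int reference = is64Bit ? 8 : 4; // the string field is just a pointer
        return header + ValueFieldBytes + reference;
    }

    // Number of elements in an x * y * z array.
    public static long ElementCount(int x, int y, int z)
    {
        return (long)x * y * z;
    }

    public static void Main()
    {
        long count = ElementCount(200, 80, 200);
        Console.WriteLine(count);                      // 3200000
        Console.WriteLine(count * 4);                  // 12800000  (array of references, 32-bit)
        Console.WriteLine(count * 8);                  // 25600000  (array of references, 64-bit)
        Console.WriteLine(count * ObjectBytes(false)); // 195200000 (objects, 32-bit)
        Console.WriteLine(count * ObjectBytes(true));  // 246400000 (objects, 64-bit)
    }
}
```

Real numbers will differ somewhat because of padding and the actual runtime's object layout, so treat this as a rough estimate only.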

However, we’re not done yet. A string is a reference type. As such it can be null, in which case no additional memory is needed. However, when you assign an actual string, it’s another separate object.

A string object also has an overhead. The size of a string is 12 + (1 + length/2)*4 bytes. Note that length/2 is an integer division, so 1/2 == 0 and 3/2 == 1.

As an example, your string “absdsg” would require 12 + (1 + 3)*4 == 28 bytes of memory.
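The formula can be written as a one-line function. This is only a sketch of the rule of thumb quoted above (the class and method names are made up for illustration); the real size depends on the runtime and platform.

```csharp
using System;

// Hypothetical helper implementing the rule of thumb from the text.
public static class StringSize
{
    // 12 bytes of overhead plus the characters, rounded as in the formula.
    // length / 2 is an integer division, exactly as described above.
    public static int EstimateBytes(int length)
    {
        return 12 + (1 + length / 2) * 4;
    }

    public static void Main()
    {
        Console.WriteLine(EstimateBytes("absdsg".Length)); // 28
        Console.WriteLine(EstimateBytes(8));               // 32
    }
}
```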

Now it gets tricky. In your case above, all class instances would simply reference the same string object, as this is a string constant, which is an “interned” string. However, when you dynamically “create” a string, that is, using “+” to combine strings or using string.Format or “.ToString” on something, you will create a separate string for each object. So in this case you would create 3.2M string objects as well.
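The difference between an interned constant and a string built at run time can be checked with object.ReferenceEquals. The following is a small standalone sketch (class and method names are hypothetical); it builds the run-time string from a char array so that no compile-time constant folding can get in the way.

```csharp
using System;

public static class InternDemo
{
    // Creates a new string object at run time with the same contents
    // as the literal "absdsg".
    public static string BuildAtRuntime()
    {
        return new string(new[] { 'a', 'b', 's', 'd', 's', 'g' });
    }

    public static void Main()
    {
        string a = "absdsg";
        string b = "absdsg";
        string c = BuildAtRuntime();

        // Two identical literals refer to one interned object.
        Console.WriteLine(object.ReferenceEquals(a, b)); // True

        // Same contents, but a separate object on the heap.
        Console.WriteLine(a == c);                       // True
        Console.WriteLine(object.ReferenceEquals(a, c)); // False

        // string.Intern returns the shared instance for those contents.
        Console.WriteLine(object.ReferenceEquals(a, string.Intern(c))); // True
    }
}
```

So with the literal in the question there is only one string object in total, while strings produced per element at run time would each be a separate object for the GC to visit.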

In general, the GC has to track every managed reference and follow it when checking whether an object can be collected. If your class doesn’t contain reference types, it just has to check the class instance itself. However, if it does contain reference types (like a string), the GC has to check whether there’s an object behind that reference and, if there is, check that object as well.

If each of your class instances has its own string with an average length of 8, you would have an additional memory usage of 3.2M * 32B == 102.4MB.