In particular, a 32-bit platform limits you to roughly 16 MB strings, 31-bit integers, and 4,194,303-element arrays. Bigarrays aren’t subject to these limits; they are only limited by the amount of memory your process can address.
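For reference, you can query these limits from a running program; all three values below come from the standard library’s `Sys` module, and on a 64-bit build they come out far larger:

```ocaml
(* Quick check of the runtime limits on the current platform. *)
let () =
  Printf.printf "word size          = %d bits\n" Sys.word_size;
  Printf.printf "max_string_length  = %d bytes\n" Sys.max_string_length;
  Printf.printf "max_array_length   = %d elements\n" Sys.max_array_length
```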
In addition to @jfester’s perfectly correct reply, let me add that the operating system and execution environment may further restrict how much memory a single process can use. The natural limit for a 32-bit platform is 4 GB, but I remember some Linux versions with a 3 GB limit, Windows versions with a 2 GB limit, and Cygwin versions with a 1 GB limit.
If you can use a 64-bit platform, by all means go for it, as it will avoid all these silly limitations.
Typically, when dealing with large files like this, it’s best to process them in a “streaming” manner, so that you only hold a fraction of the file in memory at any one time.
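As a rough sketch of what I mean (the file name and chunk size are just placeholders), something along these lines reads the file one buffer at a time instead of loading it whole:

```ocaml
(* Minimal sketch: read a large file in 64 KB chunks so only one buffer's
   worth of data is in memory at a time. "big-file.xml" is a placeholder;
   the commented line is where your real processing would go. *)
let () =
  let ic = open_in_bin "big-file.xml" in
  let buf = Bytes.create 65536 in
  let rec loop total =
    let n = input ic buf 0 (Bytes.length buf) in
    if n = 0 then total
    else begin
      (* ... hand (Bytes.sub_string buf 0 n) to your parser here ... *)
      loop (total + n)
    end
  in
  let total = loop 0 in
  close_in ic;
  Printf.printf "read %d bytes without ever holding the whole file\n" total
```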
I haven’t worked with XML in years, and when I did, I wrote my own parsers, so I can’t point at particular software. @ttamttam has made some suggestions. More generally, there are (for XML) DOM parsers and streaming parsers. You want the latter.
This is (or was) a very active area of development for Java, so you might find good reading material in the Java world.
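If it helps, in the OCaml world xmlm is one streaming (pull-style) XML parser. From memory of its API (so treat the details as approximate), walking a document event by event looks roughly like this:

```ocaml
(* Pull signals from an XML document one at a time with xmlm, printing
   element names as they open and counting them. Only the parser's internal
   buffer is held in memory, never the whole document. *)
let count_elements file =
  let ic = open_in file in
  let xi = Xmlm.make_input (`Channel ic) in
  let rec loop count =
    if Xmlm.eoi xi then count
    else
      match Xmlm.input xi with
      | `El_start ((_, name), _attrs) ->
          Printf.printf "start of <%s>\n" name;
          loop (count + 1)
      | `El_end | `Data _ | `Dtd _ -> loop count
  in
  let n = loop 0 in
  close_in ic;
  n
```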