
What a Strings.xml file should look like, for those of you who are editing them hard.



So, I've been digging through the guts of several mods, trying to figure out how to merge them, and I've come to a distressing conclusion.

Metadata sucks!

The core of modding is using a text editor which can do a text diff; I suspect most people use Notepad++ for this, though there are alternatives.

You know what metadata does to a text diff? It makes it weep, it makes it scream, it makes it cry uncle. When you're doing a text diff, you only want to see actual differences in the files you're working with, not every single cell where the last person to edit one file used Arial and the last person to edit the other used Microsoft Sans Serif.

Not to mention that it bloats your file size significantly. How significantly? By a factor of about 1.6 in the file I just cleaned.

I've spent the last hour cleaning up the Strings.xml from Khall's Carbines. It started at over 12,000 KB, though much of that is because his spreadsheet program must have glitched somewhere and repeated the same few cells in a row approximately ten bajillion times. Clean up the glitched info, and you get a file that's been pared down to about 640 KB. Clean up the blasted metadata, though - the NamedCell nonsense, the style info, etcetera - and you can get the sucker down to 390 KB. That's not liposuction, that's taking a chainsaw to the file.

So, what should a properly edited Strings.xml file look like?

<?xml version="1.0"?>
<Workbook>
<Worksheet ss:Name="strings">
 <Table>
  <Column/>
  <Column/>
  <Row>
   <Cell><Data ss:Type="String">Column Name</Data></Cell>
   <Cell><Data ss:Type="String">Column Contents</Data></Cell>
  </Row>
  ...
 </Table>
</Worksheet>
</Workbook>

And honestly, I'm not even 100% sure that <Workbook> and <Worksheet> are necessary, but I ain't gonna strip 'em. But that's it - you should have a <Row> tag, containing two <Cell> tags, each of which contains a <Data ss:Type="String"> tag, which contains your info. Not only does this make the file much lighter and much, much easier on whatever text diff program you're going to use in the end, but it enhances human readability by approximately over nine thousand percent. Once you've pared most of the file down this way, you can just hold the Page Down key to scan the file and look for bits that are obviously out of place. For example, just by visual scanning, I detected a long section beginning with Items.AlienBaseCoreDesc and proceeding to Researches.HeavyDroneWreckageDesc where the second cell is missing entirely, and I'll have to look into that.
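If you'd rather not trust your eyeballs alone, the same "does every row have two cells?" check can be sketched in Python. This is only a rough sketch, not anything the game or Notepad++ provides: the function name rows_with_wrong_cell_count is made up for illustration, and it works on the raw text with regular expressions rather than a real XML parser, on the assumption that the file has already been cleaned up into the shape shown above.

```python
import re

# One <Row ...>...</Row> block, possibly spanning several lines.
# The optional group refuses a trailing "/" so self-closing <Row/> is skipped.
ROW_RE = re.compile(r"<Row\b(?:[^>]*[^/>])?>(.*?)</Row>", re.DOTALL)
# The start of any <Cell> tag: plain, with attributes, or self-closing.
CELL_RE = re.compile(r"<Cell\b")
# The text of a <Data> element, used to label the offending row.
DATA_RE = re.compile(r"<Data\b[^>]*>(.*?)</Data>", re.DOTALL)


def rows_with_wrong_cell_count(xml_text, expected=2):
    """Return (label, cell_count) for each row without `expected` cells.

    Hypothetical helper for illustration; `label` is the text of the
    row's first <Data> element, or "(no data)" if it has none.
    """
    problems = []
    for match in ROW_RE.finditer(xml_text):
        body = match.group(1)
        count = len(CELL_RE.findall(body))
        if count != expected:
            first = DATA_RE.search(body)
            label = first.group(1) if first else "(no data)"
            problems.append((label, count))
    return problems
```

Feed it the whole file as one string (for example, the result of reading your Strings.xml as text) and it hands back the rows worth a second look.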

Oh, and one last tip?

<!-- This is a comment. -->

That's the comment tag. Anything that parses the file ignores it; it's just for human readability. That makes it the most perfect thing ever for someone tinkering in the guts of an XML file to leave themselves (or future modders) notes about what they've seen in the file that struck them as strange, or (perhaps more importantly) what they've done to the file. Documenting your work now will save you massive headaches later, I promise you that.

(Plus, it makes it very easy to ctrl-F on the string "<!--" to find things you've seen before.)

Edited by ShadowDragon8685

I just looked at all the files I'm going to have to clean...

I despaired for a while, and then I stepped up my game. By which I mean I whined to a friend until he told me how to use Notepad++'s powerful regular expressions to do what I want them to do.

Get the latest version of Notepad++; it's presently 6.6.8. I mean, this will probably work with versions newer than 5, but why take a chance?

Note: When they updated Notepad++ past the fives, the Compare plugin - you know, that thing we all got Notepad++ for in the first place - broke syntax highlighting. To fix this, Google "Notepad++ Compare Plugin 1.5.6.2" and install the updated version that unbreaks syntax highlighting.

Anyway, here's what you do:

Use

<Row[^>].*

To find and select ROW elements with stuff in them. Replace it with <Row>, replace them all. If, for some reason, the file you're working on has the <Row> elements on the same line as other elements, do not do this: the .* on the end is greedy and will gobble up EVERYTHING on the rest of the line.

Do not have Matches Newline checked or you are in for a baaad day!
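To see what "gobble up EVERYTHING" means in practice, here's a small Python sketch comparing the pattern above with a bounded alternative. The bounded pattern, <Row\s[^>]*>, is my own suggestion rather than anything from the steps above; it assumes the file has no self-closing <Row .../> tags with attributes, since it would turn those into plain <Row>. Python's regex flavour is not identical to Notepad++'s, but both patterns behave the same way on a one-line sample like this.

```python
import re

# A line where the <Row> tag shares the line with its cells.
line = ('<Row ss:AutoFitHeight="0"><Cell><Data ss:Type="String">'
        'Abandon</Data></Cell></Row>')

# The pattern from the post: ".*" keeps going to the end of the line,
# so the cells and the closing </Row> are replaced along with the tag.
greedy = re.sub(r"<Row[^>].*", "<Row>", line)

# A bounded alternative: "[^>]*" stops at the first ">", so only the
# opening tag itself is rewritten.
bounded = re.sub(r"<Row\s[^>]*>", "<Row>", line)

print(greedy)   # <Row>
print(bounded)  # <Row><Cell><Data ss:Type="String">Abandon</Data></Cell></Row>
```

The same bounded pattern can be pasted into Notepad++'s regular expression mode if your rows do share lines with other elements.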

That's part of the battle down. Now!

Use

<Cell[^>].*?[^/]>

To pick out CELL elements with BS in them; replace them all with <Cell>. Note that this will SKIP self-closing CELL elements; those have to be found and dealt with separately (see below).

What's a self-closing element? It's one that looks like this

<Cell ss:Style="Bullshit"/>

The key is the />. But now that the other Cell elements have been normalized, a simple expression can take care of them.

Plug <Cell[^>].*?> into the "Find what" box, and replace all with <Cell/>.

Bing badda boom, bob's your uncle.
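For anyone who'd rather script it than click through the dialog twice, here's a hedged Python sketch that does both Cell passes in one go. The name strip_cell_attributes is invented for this example, and the pattern is a variation on the two above rather than a copy of them: it captures the optional "/" before the ">" and puts it back, so ordinary cells become <Cell> and self-closing ones become <Cell/>. It assumes no attribute value contains a literal ">".

```python
import re

# "<Cell", at least one whitespace character, then attributes up to the
# closing ">". The optional "/" is captured so it can be kept.
CELL_OPEN_RE = re.compile(r"<Cell\s[^>]*?(/?)>")


def strip_cell_attributes(xml_text):
    """Drop attributes from <Cell ...> and <Cell .../> tags.

    Hypothetical helper for illustration; tags that are already plain
    <Cell> or <Cell/> are left untouched.
    """
    return CELL_OPEN_RE.sub(r"<Cell\1>", xml_text)


sample = ('<Cell ss:StyleID="s21"><Data ss:Type="String">Abandon</Data></Cell>'
          '<Cell ss:StyleID="s21"/>')
print(strip_cell_attributes(sample))
# <Cell><Data ss:Type="String">Abandon</Data></Cell><Cell/>
```

Note that the <Data ss:Type="String"> tag is left alone, since the pattern only fires on tags that start with <Cell followed by whitespace.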

Edited by ShadowDragon8685

This is all nice, but if you want to step up your game, why not spend your time merging mods in a way that isn't on the verge of becoming obsolete?

http://www.goldhawkinteractive.com/forums/showthread.php/11433-Documentation-Modular-mods-system

Three reasons.

One: I'm an obstinate pig-headed jackass and once I've set myself to something I'm either going to finish it or get frustrated, angry at everything it represents, and sod off to sulk for a while.

Two: I'm still trying to get this all to work just right so I can play a bloody full game the way I want it to play.

Three: I've developed a personal vendetta against this metadata BS and I want to see it scourged from the files we're distributing around these parts.


Oh, and I've missed one important thing, too: Named Cell!

Sometimes you'll find elements like this:

<NamedCell ss:Name="_FilterDatabase"/>

They're annoying. Ctrl-H them and replace them with an empty string to be rid of them. But sometimes, you'll find them on more than one line, like this:

  <Row>
   <Cell><Data ss:Type="String">Abandon</Data></Cell>
   <Cell><Data ss:Type="String">Abandon mission</Data><NamedCell
   ss:Name="_FilterDatabase"/></Cell>
  </Row>

I hate that, don't you? Fortunately, there's a solution. Go to the Find/Replace dialog again, move the Search Mode radio button up one, from Regular expression to Extended, and plug in the following:

<NamedCell\r\n ss:Name="_FilterDatabase"/>

Again, replacing it with the empty string will do you good. Find and replace all, preferably in all your open documents. Ctrl-F and look for NamedCell again. If you see it again, you'll have to adjust the search string. Each "\r\n" tells Notepad++ to match one Windows line break (\r is a carriage return, \n is a line feed). Remember that you need to include the white-space, too, when you're copy-pasting the second-line part of NamedCell into Notepad++.
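If fiddling with the exact number of line breaks and spaces gets old, a single pattern can cover both the one-line and the split-across-lines cases. The Python sketch below is one way to do it, under the assumption that every NamedCell in the file is a self-closing tag like the ones shown above; the name strip_named_cells is made up for this example. It leans on the fact that [^>] matches line breaks and spaces as readily as any other character.

```python
import re

# "<NamedCell", then anything that is not ">" (including \r, \n and
# spaces), up to the self-closing "/>".
NAMED_CELL_RE = re.compile(r"<NamedCell\b[^>]*/>")


def strip_named_cells(xml_text):
    """Remove every self-closing <NamedCell .../> tag, split or not.

    Hypothetical helper for illustration.
    """
    return NAMED_CELL_RE.sub("", xml_text)


split_tag = ('<Cell><Data ss:Type="String">Abandon mission</Data><NamedCell\r\n'
             '   ss:Name="_FilterDatabase"/></Cell>')
print(strip_named_cells(split_tag))
# <Cell><Data ss:Type="String">Abandon mission</Data></Cell>
```

The same pattern should also work in Notepad++'s regular expression mode, though I'd still search for NamedCell afterwards to confirm nothing was missed.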

Happy hunting!
