Make the changelog parser more lenient, please!

Pi-C
Smart Inserter
Posts: 1725
Joined: Sun Oct 14, 2018 8:13 am

Re: Make the changelog parser more lenient, please!

Post by Pi-C »

badtouchatr wrote: Thu Apr 11, 2019 7:32 am Pi-C: I agree 99.9999993% (sorry, rounding error ;) )

I'm going to reserve judgment until I make the next update, but I'm leaning towards one of these possibilities:

- something doesn't work right for the very first time you upload a new mod, or
- the parser is just acting moody.
:-)
A good mod deserves a good changelog. Here's a tutorial (WIP) about Factorio's way too strict changelog syntax!
Pi-C
Smart Inserter
Posts: 1725
Joined: Sun Oct 14, 2018 8:13 am

Post by Pi-C »

Bilka wrote: Thu Apr 11, 2019 7:34 am As a sidenote, some of the errors have already been improved a bit. I changed the "no colon after category error" to instead say "category line does not end in colon", because that is what it is actually checking - a space after the colon will throw this error. The "duplicate date" error now says "duplicate date or version" and the basically empty "error on line x" now says what it expects the line to start with.
Thanks, that will help a lot!
orzelek
Smart Inserter
Posts: 3922
Joined: Fri Apr 03, 2015 10:20 am

Post by orzelek »

I looked at the RSO changelog that is maintained on the mod portal description page and then at the changelog tutorial... and I'm sorry, that's not happening.
The game has stricter requirements for a changelog than any of the programming languages I use. It's actually easier to code a mod than a changelog.

While I see the benefit of in-game changelogs and might start a new one from some version, converting the whole thing seems like a lot of work, with error messages about as helpful as a C++ compiler's when using nested templates and a lot of std types.

Is anyone planning to write some kind of changelog preparation tool where you could enter versions and text and it would format the whole thing?
badtouchatr
Long Handed Inserter
Posts: 80
Joined: Sat Aug 20, 2016 8:00 pm

Post by badtouchatr »

orzelek wrote: Thu Apr 11, 2019 5:51 pm Anyone planning to write some kind of changelog preparation tool where you could enter versions and text and it would format the whole thing?
That actually sounds like an awesome idea, and I would even be willing to do that, or help with that. With the stipulation, of course, that we get complete documentation from the devs on the changelog parser rules, which I believe Bilka is working on. :)
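
As a rough illustration of what such a preparation tool could look like, here is a minimal Python sketch. It assumes only the format rules reported in this thread (a divider of exactly 99 dashes, a "Version: " line, an optional "Date: " line, category lines indented by two spaces and ending in a colon, entries starting with four spaces, a dash and a space, and continuation lines indented by six spaces). The function name `format_changelog` and the input layout are hypothetical, not an existing tool, and the real parser may have rules this sketch does not cover.

```python
DIVIDER = "-" * 99  # the thread reports that exactly 99 dashes are required


def format_changelog(releases):
    """Render releases as changelog text.

    `releases` is a list of dicts with the hypothetical keys "version",
    optional "date", and "categories" (category name -> list of entries).
    """
    lines = []
    for release in releases:
        lines.append(DIVIDER)
        lines.append(f"Version: {release['version']}")
        if release.get("date"):
            lines.append(f"Date: {release['date']}")
        for category, entries in release["categories"].items():
            # Category lines: two spaces of indentation, nothing after the colon.
            lines.append(f"  {category.strip().rstrip(':')}:")
            for entry in entries:
                if not entry.strip():
                    continue  # skip empty entries instead of emitting a bare dash
                first, *rest = entry.strip().splitlines()
                lines.append(f"    - {first.strip()}")
                # Further lines of a multi-line entry are indented by six spaces.
                lines.extend(f"      {extra.strip()}" for extra in rest if extra.strip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_changelog([
        {"version": "0.0.1", "categories": {"Info": ["ported to 0.17"]}},
    ]), end="")
```

Generating the file from structured input sidesteps the indentation and divider mistakes entirely, because the author never types the whitespace by hand.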
JAetherwing
Inserter
Posts: 34
Joined: Sat Dec 29, 2018 2:51 pm

Post by JAetherwing »

Rseding91 wrote: Wed Apr 10, 2019 10:20 pm So, I read all of the posts so far here. I agree the error(s) should be more explicit about what is incorrect. I don't agree that any of them should be made more lenient.

Factorio doesn't do "it's close, so I'll fix it" logic - the thing is either correct or incorrect - and in the case of changelog files they're incorrect if they don't match the exact formatting required by the changelog system.

We specifically don't allow any variance because there's no reason for it. If the thing is wrong... then it's wrong.
I agree with this regarding ambiguous syntax or changelogs that do something different than the parser expects.

But please, enlighten me on why on earth the parser requires exactly 99 dashes for a divider and fails for a divider made from, say, 80 dashes. Especially since this is not documented anywhere.

I'm totally on your side that the changelogs should be in a standardized syntax, but those arbitrary restrictions are just ludicrous.
Rseding91
Factorio Staff
Posts: 14250
Joined: Wed Jun 11, 2014 5:23 am

Post by Rseding91 »

JAetherwing wrote: Fri Apr 12, 2019 5:32 pm I agree with this regarding ambiguous syntax or changelogs that do something different than the parser expects.

But please, enlighten me on why on earth the parser requires exactly 99 dashes for a divider and fails for a divider made from, say, 80 dashes. Especially since this is not documented anywhere.

I'm totally on your side that the changelogs should be in a standardized syntax, but those arbitrary restrictions are just ludicrous.
Look at the changelog section here: https://www.boost.org/users/history/version_1_70_0.html for the "Beast" section. THAT is why we force one and exactly one format on the entire changelog.
If you want to get ahold of me I'm almost always on Discord.
BlueTemplar
Smart Inserter
Posts: 3031
Joined: Fri Jun 08, 2018 2:16 pm

Post by BlueTemplar »

bobingabout wrote: Tue Apr 09, 2019 3:42 pm I think the fact that it specifically has to be UTF-8 without BOM, or it FAILS is pretty brutal.
If your text editor doesn't save text as UTF-8 (without BOM) by default, then it's your text editor that is to blame.
Even Microsoft ended up making it the default in Windows Shell & Notepad
(though internally it still uses an obsolete version of Unicode).
(The update for Windows 10 should be available this spring.)

jamiechi1 wrote: Wed Apr 10, 2019 3:08 pm I have been using notepad++ for years and had to set it to utf-8 (without BOM) to edit html and javascript properly. So I haven't seen some of the issues others see.
I believe the reason for the differences in how things are parsed versus in-game, is due to the different environments.
The web page most likely uses javascript (in html) and the game uses c++ internally. (Different parsers)
Yeah there might be an issue with C++, which handles UTF-8 poorly :
https://stackoverflow.com/questions/171 ... 5#17106065

And it looks like it doesn't even handle it the same way depending on what OS it was compiled on / is being run on ?
https://alfps.wordpress.com/2011/11/22/ ... pproaches/
https://alfps.wordpress.com/2011/12/08/ ... ream-mode/

(so, C++ might choke on a BOM ?)
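
For anyone unsure whether their editor wrote a BOM, here is a minimal Python sketch that detects and removes a leading UTF-8 BOM. The helper names `strip_utf8_bom` and `rewrite_without_bom` are made up for this example; the only library facts relied on are `codecs.BOM_UTF8` (the bytes EF BB BF) and `pathlib.Path.read_bytes` / `write_bytes`.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(path: str) -> bool:
    """Rewrite the file in place without a BOM; return True if one was removed."""
    file = Path(path)
    data = file.read_bytes()
    cleaned = strip_utf8_bom(data)
    if cleaned != data:
        file.write_bytes(cleaned)
        return True
    return False
```

Calling `rewrite_without_bom("changelog.txt")` before uploading would rule out the BOM as the cause of a parse failure, whatever the editor's defaults are.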

jamiechi1 wrote: Wed Apr 10, 2019 3:08 pm There definitely needs to be a formal document somewhere that explicitly defines the required markup.
White space, including linefeeds should be ignored in processing the change log information, just as it is ignored for the most part in many languages such as Lua and C++.

Maybe XML should be used for the change logs. This will make it easier to define exactly where things go and what they are.

Maybe a better option is to make it simple, like Windows used to do with 'ini' files. An example of a simple 'ini' file format is the one used in the Fallout games' ini files: square brackets to delineate sections and simple text with no need to worry about encoding styles.

Keep it simple.
Orv wrote:XML proved handily that it's possible to make something verbose and inefficient for computers without actually making it human-readable.
(Also, in my experience, while XML in theory now supports UTF-8, hardly any XML tool / library does...)

The go-to human-readable format these days seems to be YAML :
(the current changelog syntax might already be a stricter version of it ?)
BobDiggity (mod-scenario-pack)
Pi-C
Smart Inserter
Posts: 1725
Joined: Sun Oct 14, 2018 8:13 am

Post by Pi-C »

Bilka wrote: Thu Apr 11, 2019 7:34 am As a sidenote, some of the errors have already been improved a bit. I changed the "no colon after category error" to instead say "category line does not end in colon", because that is what it is actually checking - a space after the colon will throw this error. The "duplicate date" error now says "duplicate date or version" and the basically empty "error on line x" now says what it expects the line to start with.
Just noticed another new error message:

Code:

invalid changelog file, error on line 1, line does not start with exactly '    - ' or exactly '      '
Thanks for implementing it, that should be very helpful!

For the sake of an example, I'll document a complete debug session for a very basic changelog file. Let's see if I can get it to work by following the error messages. That's what I will start with:

Code:

v0.0.1
-ported to 0.17
It shows this error:

Code:

invalid changelog file, error on line 1, line does not start with exactly '    - ' or exactly '      '..
Not quite as expected, as an error on line 1 always means that the first changelog entry doesn't start with a proper header line. But just relying on the error message, I'll correct that anyway:

Code:

    - v0.0.1
-ported to 0.17
The parser now reports:

Code:

invalid changelog file, error on line 1, missing category.
Let's add one:

Code:

Info:
    - v0.0.1
-ported to 0.17
Now I get the earlier error again:

Code:

invalid changelog file, error on line 1, line does not start with exactly '    - ' or exactly '      '
However, this line ends with a colon, so it should be regarded as an incorrect category line. The error message is misleading because category lines must be indented with only two spaces. With the current error message, we'd just end up in a vicious circle. Let's try again, this time with a correct category line:

Code:

  Info:
    - v0.0.1
-ported to 0.17
Now it's better:

Code:

invalid changelog file, error on line 1, missing version
I'll just move the version line up, and because it didn't work previously, I fiddle around a bit and end up with a correct Version line:

Code:

Version: 0.0.1
  Info:
-ported to 0.17
Doesn't help though, we're back where we started:

Code:

invalid changelog file, error on line 1, line does not start with exactly '    - ' or exactly '      '..
In my assumed role as a mod author, I give up at this point. :-)


Let's start all over again:

Code:

v0.0.1
-ported to 0.17
Now, let's assume I get a message like this:

Code:

invalid changelog file, error on line 1, line is not a valid header line.
So I add a header line:

Code:

--------------------------------------------------------------------------------------------------
v0.0.1
-ported to 0.17
I still get an error, because this header line contains only 98 dashes. :-) So, it would make sense to make the error message more explicit:

Code:

invalid changelog file, error on line 1, line is not a valid header line (must only contain exactly 99 dashes)
I finally have figured out what's wrong, so the file now is:

Code:

---------------------------------------------------------------------------------------------------
v0.0.1
-ported to 0.17
This time, the error message is actually helpful again:

Code:

invalid changelog file, error on line 2, missing Version: line.
So I take the clue and correct line 2:

Code:

---------------------------------------------------------------------------------------------------
Version: 0.0.1
-ported to 0.17
Now I get

Code:

invalid changelog file, error on line 3, line does not start with exactly '    - ' or exactly '      '..
The missing category isn't mentioned yet, but let's just ignore that for now and fix the error:

Code:

---------------------------------------------------------------------------------------------------
Version: 0.0.1
    - ported to 0.17
Ah, here's the expected message:

Code:

invalid changelog file, error on line 3, missing category
So we add a category:

Code:

---------------------------------------------------------------------------------------------------
Version: 0.0.1
Info:
    - ported to 0.17
There's still an error:

Code:

invalid changelog file, error on line 3, line does not start with exactly '    - ' or exactly '      '..
This line is supposed to be a Category line, however, because it ends with a colon. But the patterns suggested by the current error message are not correct in this case. Something like

Code:

invalid changelog file, error on line 3, line does not start with exactly '  ' or exactly  '    - ' or exactly '      '.
would be better. We could then proceed with

Code:

---------------------------------------------------------------------------------------------------
Version: 0.0.1
  Info:
    - ported to 0.17
and the changelog would be parsed without an error.
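
To make the rules from this walkthrough concrete, here is a minimal Python sketch of a checker that reports the first problem using the more explicit messages proposed above. It is not the game's parser: `check_changelog` is a hypothetical name, the rules are only those observed in this thread, and the handling of blank lines and the placement of "Date: " lines are assumptions (blank lines are rejected here, which may be stricter than the game).

```python
DIVIDER = "-" * 99  # a header line is exactly 99 dashes, per the walkthrough


def check_changelog(text):
    """Return the first problem as (line_number, message), or None if none is found."""
    expect_version = False  # True right after a header line
    have_category = False   # True once the current version has a category line
    for number, line in enumerate(text.splitlines(), start=1):
        if number == 1 and line != DIVIDER:
            return number, "line is not a valid header line (must only contain exactly 99 dashes)"
        if line == DIVIDER:
            expect_version = True
            have_category = False
        elif expect_version:
            if not line.startswith("Version: "):
                return number, "missing Version: line"
            expect_version = False
        elif line.startswith("Date: "):
            continue  # assumption: a Date line is accepted anywhere after the Version line
        elif line.startswith("    - ") or line.startswith("      "):
            if not have_category:
                return number, "missing category"
        elif line.startswith("  ") and not line.startswith("   "):
            # Exactly two leading spaces: this is meant to be a category line.
            if not line.endswith(":"):
                return number, "category line does not end in colon"
            have_category = True
        else:
            return number, ("line does not start with exactly '  ' "
                            "or exactly '    - ' or exactly '      '")
    return None
```

Run against the intermediate files above, this sketch reports line 1 for the missing or 98-dash header, line 2 for `v0.0.1`, line 3 for the unindented `Info:`, and nothing for the final version.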

Summary:
An error on line 1 should always result in

Code:

invalid changelog file, error on line 1, line is not a valid header line (must only contain 99 dashes)
An error message for lines after the Version line should also include the 2-space indentation for Category lines:

Code:

invalid changelog file, error on line x, line does not start with exactly '  ' (Category: )or  '    - ' or exactly '      ' (list of entries below a category)
I've not enough time now for more testing with more entries, but the suggested changes should improve your changes to the error messages even further! :-)

Edit: Had to tag the error messages as code because a series of spaces is reduced to one space otherwise. If only there was a tag that would allow inline code (prints everything between start and end tag as is, preserving multiple spaces, but doesn't add an extra box like the code tag) in a post! :-)
Trebor
Filter Inserter
Posts: 292
Joined: Sun Apr 30, 2017 1:39 pm

Post by Trebor »

I made a sed script that can be used to clean up some problems with changelogs. It was inspired by the sed snippets from the original post. Since I don't have GNU sed, this uses a POSIX sed.

Code:

# Clean up tabs, spaces and blank lines.
s/	/        /g
s/ +$//
/^$/d

# Clean up lines just containing dashes.
/^ +-+$/s/.*/----/
/^-+$/s/-+/---------------------------------------------------------------------------------------------------/
${/^-+$/d
  # Note detecting dashes on the last line only works if there are no blank lines after!
}

# Make sure the first line is a header, all headers are 99 dashes and the last line is not a header.
1{/^-+$/!i\
---------------------------------------------------------------------------------------------------
}

# Fix version and date lines.
/^ *[Vv][Ee][Rr][Ss][Ii][Oo][Nn][ :]/{
  s/^ *[Vv][Ee][Rr][Ss][Ii][Oo][Nn] *:? *(.+)$/Version: \1/
  b
}
/^ *[Dd][Aa][Tt][Ee][ :]/{
  /^ *[Dd][Aa][Tt][Ee] *:?$/d
  s/^ *[Dd][Aa][Tt][Ee] *:? *(.+)$/Date: \1/
  b
}

# Make sure other lines are indented correctly.
s/^ *([^-].+)/      \1/
/^-+$/!s/^ *- *(.+)$/    - \1/

# Clean up any categories.
s/^ *(( *[^ :])+) *: *-+$/  \1:/
/^ *[^:]+ *: *-+/{
  h
  s/^[^:]+: *- *(.+)/    - \1/
  x
  s/^ *(( *[^ :])+) *:.*$/  \1:/
  G
}
If we run it against this ugly change log:

Code:

   versION : 0.1.0

Date:
    Date : 2019/07/22

  category          :-

othercat    : - stuff

hello
------------
Using the command: sed -Ef cleanup.sed changelog

We get this new change log:

Code:

---------------------------------------------------------------------------------------------------
Version: 0.1.0
Date: 2019/07/22
  category:
  othercat:
    - stuff
      hello
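
For readers without sed at hand, the first steps of the script (tab expansion, trailing-space removal, blank-line removal and divider normalisation) can be sketched in Python roughly as follows. This is a partial, hypothetical port: `tidy_lines` is a made-up name, and the Version, Date, category and entry fixes of the sed script are not reproduced.

```python
import re

DIVIDER = "-" * 99


def tidy_lines(text):
    """Apply the whitespace and divider clean-up steps; return the cleaned lines."""
    cleaned = []
    for line in text.splitlines():
        # Expand tabs to eight spaces and drop trailing spaces, like the first sed rules.
        line = line.replace("\t", " " * 8).rstrip(" ")
        if not line:
            continue  # blank lines are removed
        if re.fullmatch(r"-+", line):
            line = DIVIDER  # any unindented dash-only line becomes a 99-dash header
        cleaned.append(line)
    # The last line must not be a header ...
    if cleaned and cleaned[-1] == DIVIDER:
        cleaned.pop()
    # ... and the first line must be one.
    if not cleaned or cleaned[0] != DIVIDER:
        cleaned.insert(0, DIVIDER)
    return cleaned
```

Unlike the sed script, this sketch looks at the whole file at once, so the "dashes on the last line" case does not depend on there being no blank lines after it.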
Edit: Found a bug, updated sed script.
Last edited by Trebor on Sat Jul 27, 2019 4:19 pm, edited 1 time in total.
ssilk
Global Moderator
Posts: 12889
Joined: Tue Apr 16, 2013 10:35 pm

Post by ssilk »

The trend for many newer programming languages is to include a pretty printer. You run the prettifier over your code and it formats everything the way your (company, team, whatever) rules dictate. This also works for text formats. Even for JSON there are lots of online pretty printers, e.g. https://jsonformatter.curiousconcept.com/

The same could be implemented as a service for the changelog file on the mod's page.
Cool suggestion: Eatable MOUSE-pointers.
Have you used the Advanced Search today?
Need help, question? FAQ - Wiki - Forum help
I still like small signatures...
Trebor
Filter Inserter
Posts: 292
Joined: Sun Apr 30, 2017 1:39 pm

Post by Trebor »

ssilk wrote: Thu Jul 25, 2019 6:18 pm The trend for many newer programming languages is to include a pretty printer. You run the prettifier over your code and it formats everything the way your (company, team, whatever) rules dictate. This also works for text formats. Even for JSON there are lots of online pretty printers, e.g. https://jsonformatter.curiousconcept.com/
This was done not so much for pretty printing it for the mod portal, but to get the formatting correct so Factorio would accept it.