Support serializing executable suite into JSON #3902

Closed
pekkaklarck opened this issue Mar 23, 2021 · 29 comments

@pekkaklarck
Member

pekkaklarck commented Mar 23, 2021

It would be convenient to be able to serialize executable TestSuite objects into JSON that can then also be used to recreate the same suite later. The most important use cases are:

  • Easier to transfer tests between processes and machines. This would help, for example, https://github.com/MarketSquare/robotframework-cluster.
  • Possibility to save a suite, possibly a nested suite, constructed from data on the file system into a single file that is faster to parse.
  • Alternative data format for external tools generating tests.

In the subsequent comments I'll go through some design ideas I have related to this. Please use 👍 and 👎 to indicate whether you like those ideas as well as this issue in general. Written comments are obviously appreciated as well!

@pekkaklarck
Member Author

I propose we add a to_json method to TestSuite. By default it should return a JSON string, but it should also accept a path or an open file object where to save the JSON. It could also accept an option to specify whether the object should be serialized into a single JSON object or split into multiple objects. I'll write a separate comment about that.

We should also add a from_json classmethod to TestSuite for constructing a suite based on JSON data.

We should probably also add to_json and from_json to TestCase, Keyword, If and For objects.
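The "string by default, path or open file when given" behavior proposed above could be sketched roughly as follows. This is only an illustration using the standard library; `dump_json` is a hypothetical helper name and not part of Robot Framework, and the real method may handle its arguments differently.

```python
import io
import json
import os


def dump_json(data, file=None, **config):
    """Hypothetical helper: return JSON as a string, or write it to a path or open file."""
    if file is None:
        # No target given: return the JSON string.
        return json.dumps(data, **config)
    if isinstance(file, (str, os.PathLike)):
        # A path was given: open it and write the JSON there.
        with open(file, "w", encoding="utf-8") as stream:
            json.dump(data, stream, **config)
    else:
        # Anything else is assumed to be an open, writable file object.
        json.dump(data, file, **config)
    return None


suite = {"name": "Suite", "tests": [{"name": "T1"}]}
as_string = dump_json(suite)
buffer = io.StringIO()
dump_json(suite, buffer)
```

Extra keyword arguments are passed straight to `json.dumps`/`json.dump`, which is one simple way to make the formatting configurable.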

@pekkaklarck
Member Author

pekkaklarck commented Mar 23, 2021

Serializing a big suite into JSON is problematic because reading it back requires deserializing the big JSON string first. To avoid problems with that, I propose we support splitting suites into multiple objects using this algorithm:

  • The top level suite is one object.
  • The suites, tests, setup and teardown that a suite has are separate objects. Each separate object gets a growing index, and the parent stores this index in place of the real object.
  • All objects are stored into a single file, one object per line. This is not valid JSON but streaming JSON like this is pretty common. The format is also standardized but annoyingly there are two seemingly identical standards JSON Lines and Newline Delimited JSON. (The latter seems to be gone. See update 3 below.)
  • A child object's index comes directly from the order in which objects are serialized. The first child has index 1, the second index 2, and so on.

I propose we support both "one big object" and "split object" formats when constructing model objects using from_json. The to_json method should get an optional split argument. I think it should be False by default to produce valid JSON unless explicitly configured.

An example of a suite as one object:

{"name": "Suite", "doc": "Example", "tests": [{"name": "T1", "body": [{"type": "KEYWORD", "name": "Log", "args": ["Hi!"]}]}, {"name": "T2"}]}

Same suite split to multiple objects:

{"name": "Suite", "doc": "Example", "tests": [1, 2]}
{"name": "T1", "body": [3]}
{"name": "T2"}
{"type": "KEYWORD", "name": "Log", "args": ["Hi!"]}

Body items need a type to separate normal keywords from setups and teardowns as well as from IFs and FORs. We probably could omit the type with normal keywords (the common case) to make the file size smaller.

Above examples use spaces between items but I believe we should omit them to reduce file size.
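A minimal sketch of the splitting algorithm, assuming plain dictionaries as input and a breadth-first traversal so that indices match the example above. The function name `split_suite` and the key lists are illustrative assumptions, not an existing API.

```python
import json
from collections import deque

# Keys assumed to hold lists of child objects or a single child object.
LIST_KEYS = ("suites", "tests", "body")
SINGLE_KEYS = ("setup", "teardown")


def split_suite(suite):
    """Return a list of compact JSON lines, one object per line."""
    lines = []
    queue = deque([suite])
    next_index = 1  # The top level suite is line 0; children start from 1.
    while queue:
        obj = dict(queue.popleft())  # Shallow copy so the input is not modified.
        for key in LIST_KEYS:
            if key in obj:
                refs = []
                for child in obj[key]:
                    refs.append(next_index)
                    next_index += 1
                    queue.append(child)
                obj[key] = refs
        for key in SINGLE_KEYS:
            if key in obj:
                queue.append(obj[key])
                obj[key] = next_index
                next_index += 1
        # Compact separators omit the spaces between items.
        lines.append(json.dumps(obj, separators=(",", ":")))
    return lines


suite = {"name": "Suite", "doc": "Example", "tests": [
    {"name": "T1", "body": [{"type": "KEYWORD", "name": "Log", "args": ["Hi!"]}]},
    {"name": "T2"},
]}
lines = split_suite(suite)
```

Because children are queued in the same order in which their indices are handed out, the position of each line in the output is its index.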


UPDATE: Setups and teardowns are nowadays stored in model objects outside the body. There thus shouldn't be a need to save their type separately. Something like this ought to be clear enough:

{"name": "Suite", "setup": 1, ...}
{"name": "Log", "args": ["Hi!"], ...}

We probably need to separate keywords from control structures such as IF, though, so having types would possibly be a good idea anyway.


UPDATE 2: Splitting JSON like this won't be implemented as part of RF 6.1, where the JSON serialization will be introduced. We can return to this later if/when there is a need.


UPDATE 3: It seems that Newline Delimited JSON is gone (http://ndjson.org contains only gambling ads) so there are no competing streaming JSON standards anymore. If we look at streaming again, we can simply use JSON Lines.

@pekkaklarck
Member Author

pekkaklarck commented Mar 23, 2021

I propose we use new .rbt extension when saving this format into a file. We cannot really use .json if we support also JSON Lines and having a custom extension also separates these files from other JSON files. We cannot use .robot because the format is different but we can consider .rbt to mean "condensed robot".

We should allow running robot suite.rbt as well as robot directory where the directory contains these files.

We should have a way to convert a file or a directory into a .rbt file. One easy way would be adding a command line option to robot that could be used like robot --serialize tests.rbt tests.robot. Being able to configure saved data like robot --name Smoke --include smoke --serialize smoke.rbt tests would be handy, but possible additional options controlling serialization would then be mixed with "normal" options. Alternatively we could add a separate entry point that could be called like python -m robot.serialize tests.rbt and enhancing the long neglected Testdoc tool (python -m robot.testdoc) to support this would be possible as well.

.rbt files saved using this mechanism should use the "split object" approach.


UPDATE: RF 6.1 will support .rbt files but there most likely won't be separate entry point for creating them. They can be created programmatically, though, and a separate entry point can be added later.

@pekkaklarck
Member Author

Saved suite objects in .rbt files should also contain imports, variables and keywords same way as normal .robot files can have.

@pekkaklarck
Member Author

External resource files could possibly be saved into a .rbt file as well. The best approach would be including all imported resources in it automatically but this probably should be configurable. The biggest problem would be designing how to change imports so that original example.resource would point to an embedded resource. This requires changes to the import machinery and probably needs a separate issue. Could also be implemented later.

@pekkaklarck
Member Author

This issue is highly related to the proposal to add JSON based output format (#3423). Executable suite structure and result suite structure are similar, they even have common base, and they should use the same approach to serialize them to JSON. I think splitting a suite into multiple objects like I propose above ought to work fine also with result suites. They couldn't use the proposed .rbt extension, though, because they contain different data and we don't want robot directory to try running possible result files.

@pekkaklarck
Member Author

pekkaklarck commented Mar 26, 2021

The biggest reason to split JSON into multiple objects is to avoid the need to read the whole file into memory and parse the whole JSON in one go. This may not be a huge problem with data, but if this same approach is used with results (#3423) it's very important because I've seen output.xml files that are over 1GB.

Assuming we have this JSON from the earlier example

{"name": "Suite", "doc": "Example", "tests": [1, 2]}
{"name": "T1", "body": [3]}
{"name": "T2"}
{"type": "KEYWORD", "name": "Log", "args": ["Hi!"]}

it could be processed like this:

  1. Open the file and iterate over it like for line in file to read only one line at a time.
  2. Process the first line. Construct a TestSuite object with the specified name and doc. Tests are not specified directly (they are integers, not objects), so add callbacks that add them when child objects with ids 1 and 2 are encountered.
  3. Process the second line. This line has implicit id 1, and thus a callback created earlier is called with this data, creating a TestCase object that's added to the earlier suite. The test's body has one item referred to with id 3. This creates another callback.
  4. Process the third line (id 2). Call the relevant callback to create another TestCase.
  5. Process the last line (id 3). Call the relevant callback to create a Keyword that's added to the first test.
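The steps above can be sketched as follows. This is a simplified illustration that builds nested dictionaries instead of real model objects; `load_split`, `attach_later` and the key lists are hypothetical names chosen for this example.

```python
import io
import json

# Keys assumed to hold lists of child indices or a single child index.
LIST_KEYS = ("suites", "tests", "body")
SINGLE_KEYS = ("setup", "teardown")


def attach_later(callbacks, ref, container, key):
    """Register a callback that stores the child with id `ref` into `container[key]`."""
    def attach(child):
        container[key] = child
    callbacks[ref] = attach


def load_split(lines):
    """Rebuild a nested structure from an iterable of JSON lines, one line at a time."""
    callbacks = {}
    root = None
    for index, line in enumerate(lines):  # The line number is the implicit id.
        obj = json.loads(line)
        for key in LIST_KEYS:
            if key in obj:
                refs = obj[key]
                obj[key] = children = [None] * len(refs)
                for position, ref in enumerate(refs):
                    attach_later(callbacks, ref, children, position)
        for key in SINGLE_KEYS:
            if key in obj:
                attach_later(callbacks, obj[key], obj, key)
        if index == 0:
            root = obj
        else:
            # Hand this object to the parent that referred to it earlier.
            callbacks.pop(index)(obj)
    return root


data = io.StringIO(
    '{"name": "Suite", "doc": "Example", "tests": [1, 2]}\n'
    '{"name": "T1", "body": [3]}\n'
    '{"name": "T2"}\n'
    '{"type": "KEYWORD", "name": "Log", "args": ["Hi!"]}\n'
)
suite = load_split(data)
```

Since `load_split` only iterates over its argument, passing an open file reads one line at a time and never needs the whole file as a single string.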

We could extend this mechanism to strings to remove duplication. For example, the first line of the above example could be changed to this:

{"name": 1, "doc": 2, "tests": [3, 4]}
"Suite"
"Example"

This way if we ever need to use string "Suite" again, we just use 1 in place of it. This particular string may not be used that often and the string is short, but, for example, keyword documentation strings can be long and very often repeated. Possibly we should use this approach only with long strings and with strings we know are used very often like "PASS" and "KEYWORD".
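The string deduplication idea could be sketched like this. Note the assumption: the sketch keeps strings in their own table with 1-based ids, whereas the proposal above shares one index space with the other objects. `StringTable` is a hypothetical name.

```python
class StringTable:
    """Hypothetical table that gives each distinct string a 1-based id."""

    def __init__(self):
        self._ids = {}
        self.strings = []

    def intern(self, text):
        # Store the string on first use; later uses get the same id back.
        if text not in self._ids:
            self.strings.append(text)
            self._ids[text] = len(self.strings)
        return self._ids[text]


table = StringTable()
ids = [table.intern(text) for text in ["PASS", "KEYWORD", "PASS", "PASS"]]
```

However often a string repeats, it is stored only once in `table.strings`, and every use of it costs just a small integer.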

We use a similar approach, created years ago by @mkorpela and @jussimalinen, to remove duplication in log.html and report.html, and there it has worked great. There we also zip long strings, but I'm not sure whether that is needed here. It's easier and probably better to zip the whole file when needed instead. There we also represent all timestamps as integers representing milliseconds as a difference to a base time set only once. That means that instead of using e.g. 20210326 19:55:01.123 and 20210326 19:55:01.124, we'd just have base time 20210326 19:55:01.123 once and timestamps would be just 0 and 1. This isn't relevant at all with data but something to consider with results.
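The timestamp encoding can be illustrated with a small sketch. It assumes the first timestamp is used as the base time; `to_offsets` is a hypothetical helper, not code from log.html generation.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y%m%d %H:%M:%S.%f"


def to_offsets(timestamps):
    """Return the base time and each timestamp as a millisecond offset from it."""
    parsed = [datetime.strptime(ts, TIMESTAMP_FORMAT) for ts in timestamps]
    base = parsed[0]  # Assumption: the first timestamp is the base time.
    offsets = [(ts - base) // timedelta(milliseconds=1) for ts in parsed]
    return base, offsets


base, offsets = to_offsets(["20210326 19:55:01.123", "20210326 19:55:01.124"])
```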

@pekkaklarck
Member Author

Too big task for RF 4.1. Hopefully can be included in RF 5.0.

@pekkaklarck
Member Author

pekkaklarck commented Oct 19, 2021

Were you, @mkorpela and @Snooz82, or someone else, working on something that would benefit from this feature? If yes, we need to make sure this makes it into RF 5.0. We could start with a design discussion session on Slack sometime soon. If nobody needs this in the near future, it might be better to postpone this to RF 5.1 to ensure we get RF 5.0 out in time. Either way, I consider this an important enhancement and want to see it land reasonably soon.

@awasekhirni

Using TestSuiteBuilder, I have been able to generate JSON from an existing test suite. It would be nice to have whole test suites stored as a JSON object before or after running the complete suite, to build a complete "object hierarchy". Furthermore, examples of model transformation CRUD operations should be present in the documentation. TestSuite helps us create new test cases dynamically, so why not give complete example demos of that for users to explore, and also a possibility to save a TestSuite in ".robot" or JSON format again for future reuse?
There should also be a possibility to push it to a NoSQL store as a test suite object later on.

@mkorpela
Member

@pekkaklarck I had an idea at one point to build a reporting package for npm and the JS/Node world that could take in the current JS array format plus the logic of log.html and give the package user an opportunity to use the format.
This would allow creating custom views of logs with React and other modern web technologies.

@awasekhirni

Wish list as a Robot Framework/RPA developer:

  1. Ability to save dynamically generated TestSuite files as '.robot' and '.json' for reuse.
  2. robot.api documentation and CRUD operation examples would help many people completely automate the automation process.
  3. Ability to save the object hierarchy prior to execution as a JSON file with all the resources, test cases, page objects, keywords and web element locators.

@awasekhirni

Another item I missed is:

  1. Ability to write the ".robot" file as a JSON file. RIDE supports the "double tabbed" and "pipe delimited" formats for '.robot' file creation, so why not allow the user to create a '.robot' file in JSON format and execute it? JSON is an industry-wide standard, and this would benefit everyone by making automation of automation easier.

@pekkaklarck
Member Author

@mkorpela, this issue is about saving the running side model to JSON that could then be later easily parsed by Robot and executed. There's separate #3423 about output.json i.e. results in JSON format.

@pekkaklarck
Member Author

@awasekhirni, this issue covers most of the features you want. Somehow including also referenced resource and variable files or even libraries is out of its scope. Once this issue is done, we can look at those enhancements as well.

@pekkaklarck
Member Author

pekkaklarck commented Nov 16, 2021

I've lately been thinking about this issue a bit, because the same underlying logic that converts model objects to dictionaries (which can then trivially be converted to JSON) could possibly be used elsewhere as well. For legacy reasons there's still code that considers everything tests contain to be keywords, although there can be FOR loops, IF/ELSE structures, and after RF 5.0 also RETURN, BREAK, CONTINUE, and TRY/EXCEPT objects. The current non-keyword objects have attributes that allow using them as keywords, but that requires quite a bit of code here and there and is ugly in general. I want this compatibility code to be removed sooner rather than later, but there should still be a way to somehow uniformly operate on all "body items" tests can contain.

A solution I've been thinking about is adding a method to_dict(as_keyword=False) to all body items. By default they'd return all their data as a dictionary with a structure unique to that particular object type. If as_keyword=True were used, they'd return information as a dictionary containing the same keys as keyword objects have. This would allow working with all these objects "as keywords" if needed.

If the aforementioned to_dict is implemented for all body items tests can have, we could easily add to_dict also to tests and suites. Then it would be trivial to add to_json to all of these, which would just call to_dict and convert the result to JSON. That would mean half of this issue would be done. The other half would be adding from_dict and from_json class methods for constructing model objects.
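The layering described above, with to_json built on to_dict and from_json built on from_dict, could look roughly like this. The classes `Serializable`, `DemoKeyword` and `DemoTest` are simplified stand-ins invented for this sketch; they are not Robot Framework's real model classes, and the as_keyword option is left out.

```python
import json


class Serializable:
    """Hypothetical base: JSON support is derived from dict support."""

    def to_dict(self):
        raise NotImplementedError

    @classmethod
    def from_dict(cls, data):
        raise NotImplementedError

    def to_json(self):
        return json.dumps(self.to_dict())

    @classmethod
    def from_json(cls, source):
        return cls.from_dict(json.loads(source))


class DemoKeyword(Serializable):
    def __init__(self, name="", args=()):
        self.name = name
        self.args = tuple(args)

    def to_dict(self):
        return {"name": self.name, "args": list(self.args)}

    @classmethod
    def from_dict(cls, data):
        return cls(data["name"], data.get("args", ()))


class DemoTest(Serializable):
    def __init__(self, name="", body=()):
        self.name = name
        self.body = list(body)

    def to_dict(self):
        # Each body item is responsible for its own dictionary form.
        return {"name": self.name, "body": [item.to_dict() for item in self.body]}

    @classmethod
    def from_dict(cls, data):
        body = [DemoKeyword.from_dict(item) for item in data.get("body", [])]
        return cls(data["name"], body)


test = DemoTest("T1", [DemoKeyword("Log", ["Hi!"])])
restored = DemoTest.from_json(test.to_json())
```

With this split, each class only needs to know its own dictionary structure, and the JSON handling lives in one place.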


UPDATE: Data side objects (robot.running) are no longer masked as keywords, so this isn't relevant anymore.

@pekkaklarck
Member Author

At the moment it seems unlikely this issue will make it to RF 5.0. There are other more important issues and when they are done we probably want to create the release so that we get them into real use.

pekkaklarck added a commit that referenced this issue Jan 10, 2023
- Support loading JSON using open file or file path.
- Support serializing to open file or file path.
- Customizable JSON formatting. Defaults differ from what ``json``
  uses by default.
pekkaklarck added a commit that referenced this issue Jan 15, 2023
@pekkaklarck pekkaklarck self-assigned this Feb 8, 2023
@pekkaklarck pekkaklarck changed the title Support serializing executable test suite into JSON Support serializing executable suite into JSON Mar 15, 2023
@pekkaklarck
Member Author

This is the most important RF 6.1 issue because there are so many interesting usages for it in the ecosystem. Basic functionality is ready and will be part of the forthcoming RF 6.1 alpha 1. This is what works:

  1. You can serialize a suite structure into JSON by using TestSuite.to_json method. When used without arguments, it returns the JSON as a string, but it also accepts a path or an open file where to write it along with configuration options related to JSON formatting. Example:

    from robot.api import TestSuite
    
    suite = TestSuite.from_file_system('path/to/tests')
    suite.to_json('tests.rbt')

  2. You can create a suite based on JSON data using TestSuite.from_json. It works both with JSON strings and paths to JSON files. Example:

    from robot.api import TestSuite
    
    suite = TestSuite.from_json('tests.rbt')

  3. When using robot normally, it parses .rbt files automatically. This includes running individual JSON files like robot tests.rbt and running a directory containing them.

@pekkaklarck
Member Author

pekkaklarck commented Mar 15, 2023

Remaining tasks to get this done:

  • Handling resource files.
    • Support serializing resource files to JSON and using them with "normal" .robot tests. This requires deciding on a suitable JSON resource file extension. *.rsrc perhaps? UPDATE: Resource files can use either *.rsrc or *.json extension.
    • Support including resource files into the suite JSON. This requires deciding whether we should do it always or whether it should be opt-in. This may also turn out to be too complicated to do in the RF 6.1 timeframe. UPDATE: This is left for later.
  • Write documentation. API docs ought to be already pretty good, but this needs to be documented in the User Guide as well.
  • Test execution of .rbt files.
  • Check whether tests for serialization and deserialization are adequate.
  • Decide whether we need an external entry point for creating .rbt files. I believe it's not needed at this point, but if others think it's a good idea and have an idea how it should work, this certainly can still be considered. UPDATE: Left for later.

Anything else that's still missing? Any opinions about handling resource files? Ping especially @manykarim.

@luv-deluxe

I know it might be a bit late now, but perhaps once a model is loaded from JSON (*.rbt), one could update its variables with e.g.

model = TestSuite().from_json()
model.update_variable(var="myvar", value=15)
model.update_variable(var="listvar", value=["Bill", 0, None])

# This updates the model, but one can imagine that the saved .robot file would look like below:
# *** Variables ***
# ${myvar}      ${15}
# @{listvar}    Bill    ${0}    ${None}

As I remember, variables are stored as Variable() objects having value attribute that returns a tuple.
It would be helpful for robotframework-cluster or other distributed execution. The serialized suite is sent to a worker, which updates variables read from some other place (a database, the environment or a configmap) and executes it directly. No need for variable assignment via the CLI or a variable file.

@pekkaklarck
Member Author

pekkaklarck commented Apr 24, 2023

You can manipulate TestSuite.resource.variables, but it's not too convenient. That API could be enhanced, but I believe you typically want to pass variables like this to TestSuite.run and that works already now.

pekkaklarck added a commit that referenced this issue May 9, 2023
- Add typing. Part of #4570.

- Enhance `config` to convert values to tuples if original attribute
  is a tuple. This preserves tuples returned from `to_dict` in
  `to_json/from_json` roundtrip (#3902). Alternative would be using
  `@setter` with all attributes containing tuples, but it's easier to
  handle them here.

- Enhance `config` to require attributes to exist. This enhances error
  reporting.
pekkaklarck added a commit that referenced this issue May 9, 2023
- Add type hints. Part of #4570.

- Make attributes containing normal sequences (i.e. not our custom
  objects like Tags) explicitly tuples in `__init__`.

- Preserve tuples as tuples, instead of converting to lists, also in
  `to_dict` (related to #3902). That means less work and smaller
  memory usage. Earlier change to `ModelObject.config` makes sure
  tuples are preserved over `to_json/from_json` roundtrip.

- Use tuple attributes also with `robot.running.model.UserKeyword`.
  That module will get types and be enhanced in the near future.
pekkaklarck added a commit that referenced this issue May 24, 2023
- Support JSON resource files. Possible extensions are `.json` and `.rsrc`.
- Handle failures in parsing JSON files gracefully.
- Tests.
@pekkaklarck
Member Author

The above commit added support for JSON resource files and enhanced error handling if a parsed JSON file is invalid. With resource files it is possible to use either *.json or *.rsrc extension.

@pekkaklarck
Member Author

There are currently the following limitations:

  1. The source path is stored in absolute format. This causes problems if data is moved to another machine with a different directory layout. The main problem is that imports that use relative paths are relative to the source. This is a pretty bad limitation, but I'm not sure what's the best way to handle it. Possibly to_dict/json could allow making the source relative to a certain directory, and from_dict/json could then turn the relative path to an absolute path based on where the source file is located. This could possibly still be fixed in RF 6.1 but I don't consider it mandatory.
  2. Resource and variable files imported by a suite cannot be bundled with it. Fixing this would require making it possible to resolve imports without executing tests/tasks. Too much work for RF 6.1.
  3. ${CURDIR} doesn't work. Fixing it would basically require fixing the issue "${CURDIR} should be normal variable" (#596), and that's too much work for RF 6.1.
  4. Documentation is missing. This obviously must be fixed before RF 6.1.

The limitations listed above need to be mentioned in the documentation.

pekkaklarck added a commit that referenced this issue May 29, 2023
Also enhance the Selecting files to parse section in general. The
section now covers also `.json` and `.rbt` extensions even though the
JSON format (#3902) isn't otherwise documented yet.
pekkaklarck added a commit that referenced this issue May 31, 2023
Makes it possible to make suite source relative and to add a custom root to it.
This is especially useful when moving data around as JSON (#3902).

Also add docstring to `Metadata`.
@pekkaklarck
Member Author

In the previous comment I mentioned that absolute suite source is problematic. The commit above added TestSuite.adjust_source that can be used to make absolute source relative and to add a new root to a relative source. That's especially convenient when moving JSON data around. This example is from the docstring of the new method:

from robot.running import TestSuite                      
                                                         
# Create a suite, adjust source and convert to JSON.     
suite = TestSuite.from_file_system('/path/to/data')      
suite.adjust_source(relative_to='/path/to')              
suite.to_json('data.json')                               
                                                         
# Recreate suite elsewhere and adjust source accordingly.
suite = TestSuite.from_json('data.json')                 
suite.adjust_source(root='/new/path/to')                 

After this enhancement, I consider JSON serialization support to be good enough for RF 6.1. Documentation is still missing, but I'll write it after the first release candidate is out.
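The path arithmetic behind the example above can be illustrated with plain pathlib. This is only a sketch of the idea: the standalone function below is hypothetical, uses POSIX-style paths for predictability, and does not reproduce the validation or error handling the real TestSuite.adjust_source may have.

```python
from pathlib import PurePosixPath


def adjust_source(source, relative_to=None, root=None):
    """Hypothetical stand-in: make a path relative and/or add a new root to it."""
    path = PurePosixPath(source)
    if relative_to is not None:
        # Drop the machine specific prefix before the data is moved.
        path = path.relative_to(relative_to)
    if root is not None:
        # Add the prefix that is valid on the receiving machine.
        path = PurePosixPath(root) / path
    return path


relative = adjust_source("/path/to/data", relative_to="/path/to")
absolute = adjust_source(relative, root="/new/path/to")
```

Storing the relative form in JSON means imports that are relative to the source keep working once the receiving side adds its own root.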

pekkaklarck added a commit that referenced this issue Jun 9, 2023
Also minor changes to the JSON model itself. Most importantly, fix
`While` to include `on_limit`.
pekkaklarck added a commit that referenced this issue Jun 11, 2023
Includes:
- Using JSON suite files.
- Using JSON resource files.
- Using reST resource files.
- Recommend `.resource` with normal resource files more strongly.
- List all supported extensions under Registrations.

Also document reST resource files.
@pekkaklarck
Member Author

pekkaklarck commented Jun 11, 2023

Commits listed above have added a JSON schema as well as documentation in the User Guide. Creating JSON resource files was made more convenient by #4793. We can finally consider this issue done.
