Add dream.load_geant4_csv #203
Thanks for the implementation @jl-wynen.
- Can the files contain event weight? If the test file is meant to contain events, then the detector is illuminated completely uniformly.
- Can we publish the test data? If not, how do we test the loader?
- Should this combine module, segment, and counter into a 'subsegment' as in #199?
- Should we compute voxel positions from the event positions (average over events per voxel)? As it stands, there is no voxel coordinate.
- The NeXus loader produces a detector_number coord. We cannot do this here. Is this ok?
- The loader in this PR makes a position coord for each event because that is needed by the instrument view and coord transforms. Should we drop {x,y,z}_pos? Or should we not produce position?
src/ess/dream/io/geant4.py
Outdated
    def _group(dg: sc.DataGroup) -> sc.DataGroup:
        return dg.group('counter', 'segment', 'module', 'strip', 'wire')
I don't think this is the correct order? I thought "strip" should be innermost (indexing voxels along a wire). Not sure about the others either.
I chose the order based on the respective dim-lengths where the longest is on the inside. I'm happy to change it if it makes more sense to follow a physical order. @celinedurniak ?
I don't think wire is the longest actually, there are only 32 or so? Strips should be many more. Regardless, I think we should follow the logical order (I presume detector_number in NeXus files will also come in logical order)?
The number of strips and wires depends on the "detector bank":
- Mantle: 256 strips, 32 wires
- Endcap backward and forward: 16 strips, 16 wires
- HR: 32 strips, 16 wires
Strips are numbered along the z axis. For the wires, it depends on the geometry (radially for the mantle and the endcaps). For the endcaps, there are also the SUMOs.
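To make the ordering question concrete, here is a minimal sketch in plain Python (not scipp) of what "strip innermost" means for a flat voxel index. The bank sizes are the ones listed above. The helper name `flat_index`, the zero-based indices, and the restriction to the wire and strip dimensions are assumptions for illustration only, not the layout used by the loader or by NeXus files.

```python
# Strip and wire counts per detector bank, as listed in the comment above.
BANK_SHAPES = {
    'mantle': {'wire': 32, 'strip': 256},
    'endcap': {'wire': 16, 'strip': 16},
    'high_resolution': {'wire': 16, 'strip': 32},
}


def flat_index(bank: str, wire: int, strip: int) -> int:
    """Hypothetical helper: row-major index with 'strip' as the innermost dim."""
    shape = BANK_SHAPES[bank]
    if not (0 <= wire < shape['wire'] and 0 <= strip < shape['strip']):
        raise IndexError(f'({wire=}, {strip=}) out of range for {bank}')
    # Consecutive strips along one wire are adjacent in the flat index.
    return wire * shape['strip'] + strip
```

With this ordering, stepping the strip by one moves to the neighbouring voxel along the same wire, while stepping the wire by one skips a full row of strips.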
src/ess/dream/io/geant4.py
Outdated
    def _load_raw_events(filename: Union[str, os.PathLike]) -> sc.DataArray:
        table = sc.io.load_csv(filename, sep='\t', header_parser='bracket', data_columns=[])
        table = table.rename_dims(row='event')
        return sc.DataArray(sc.ones(sizes=table.sizes), coords=table.coords)
@celinedurniak Should we add variances for the weights (all ones)?
@jl-wynen I think this is still missing?
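For reference, a rough stdlib-only sketch of what the variance question amounts to. It reads a tab-separated table whose headers follow a `name [unit]` pattern and gives every event a weight of 1 with variance 1, on the assumption that each row is a single count with Poisson statistics. The example rows, the column names other than `det ID` and `z_pos`, and the helper names are made up for illustration; this is not the scipp loader.

```python
import csv
import io
import re

# Hypothetical two-event excerpt; the real column names and units may differ.
EXAMPLE = "det ID\tx_pos [mm]\tz_pos [mm]\n7\t1.5\t-20.0\n3\t0.2\t310.0\n"

_HEADER = re.compile(r'^(?P<name>.*?)\s*(?:\[(?P<unit>.*)\])?$')


def parse_header(header: str):
    """Split 'name [unit]' into (name, unit); unit is None if absent."""
    match = _HEADER.match(header.strip())
    return match['name'], match['unit']


def load_events(text: str):
    """Read tab-separated events and attach weight 1 with variance 1."""
    rows = csv.reader(io.StringIO(text), delimiter='\t')
    columns = [parse_header(h) for h in next(rows)]
    events = []
    for row in rows:
        coords = {name: float(value) for (name, _), value in zip(columns, row)}
        # One row is one detected neutron: a count of 1. For Poisson counting
        # statistics the variance of a count equals the count, hence 1.
        events.append({'weight': 1.0, 'variance': 1.0, 'coords': coords})
    return columns, events
```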
src/ess/dream/io/geant4.py
Outdated
        detector_groups: sc.DataArray, detector_id_name: str, detector_id: sc.Variable
    ) -> Optional[sc.DataArray]:
        try:
            return detector_groups[detector_id_name, detector_id].value.copy()
Is this copy necessary? Later you anyway group by component names, so another copy will be made.
Nothing gets grouped after this. In the case of mantle and high res, the object returned here is returned to the user.
I can change it such that it doesn't copy the endcaps because they are concatenated anyway.
            if (det := _extract_detector(groups, detector_id_name, i)) is not None
        ]
        if endcaps_list:
            endcaps = sc.concat(endcaps_list, data.dim)
concat (and thus copy) could be avoided by using bin instead of group above?
How so?
Instead of splitting the endcaps and using concat to merge them again, based on:
    MANTLE_DETECTOR_ID = sc.index(7)
    HIGH_RES_DETECTOR_ID = sc.index(8)
    ENDCAPS_DETECTOR_IDS = tuple(map(sc.index, (3, 4, 5, 6)))
Make bin edges as sc.array(dims=[detector_id_name], values=[3,7,8,9], unit=None) or something like that? Then use groups = data.bins(edges).
... but to be honest I do not know if that is really faster. So I'd say: leave it as it is, until we have evidence saying otherwise.
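The bin-edge idea from this thread can be sketched in plain Python, independent of scipp. With edges `[3, 7, 8, 9]`, the half-open bins `[3, 7)`, `[7, 8)` and `[8, 9)` collect the endcaps, the mantle and the high-resolution detector in one pass, so the endcaps never need to be split and concatenated. The function name and the list-of-indices output are assumptions for illustration; whether the equivalent scipp operation is faster is exactly the open benchmark question above.

```python
from bisect import bisect_right

# Detector IDs from the snippet above: endcaps 3-6, mantle 7, high resolution 8.
EDGES = [3, 7, 8, 9]
NAMES = ['endcaps', 'mantle', 'high_resolution']


def split_by_detector_id(det_ids):
    """Assign each event index to the half-open bin [EDGES[i], EDGES[i+1])."""
    bins = {name: [] for name in NAMES}
    for event, det_id in enumerate(det_ids):
        i = bisect_right(EDGES, det_id) - 1
        if 0 <= i < len(NAMES):  # IDs outside [3, 9) are dropped
            bins[NAMES[i]].append(event)
    return bins
```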
src/ess/dream/io/geant4.py
Outdated
    endcap_forward = endcaps[endcaps.coords['z_pos'] > sc.scalar(0, unit='mm')]
    endcap_backward = endcaps[endcaps.coords['z_pos'] < sc.scalar(0, unit='mm')]
Boolean indexing might be quite slow, have you considered using bin?
No. And I would want to have a benchmark to see if this is actually faster because it needs multiple passes over the events, too.
Why does it need multiple passes?
At first I thought because it needs to find min and max. But we can just use +/- Inf.
Then it still needs to bin + copy each bin.
src/ess/dream/io/geant4.py
Outdated
    return {
        key: val
        for key, val in zip(
            ('mantle', 'high_resolution', 'endcap_forward', 'endcap_backward'),
            (mantle, high_res, endcap_forward, endcap_backward),
        )
        if val is not None
    }
Seems simpler to just init the dict step by step above, avoiding, e.g., also the else for the endcap handling?
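One possible reading of this suggestion, as a small self-contained sketch: fill the result dict as each component becomes available, so absent components add no keys and no filtering comprehension is needed. The function name `assemble` and its arguments are hypothetical stand-ins for the variables in the diff.

```python
def assemble(mantle=None, high_res=None, endcaps=None):
    """Hypothetical helper: collect only the components present in the file.

    `endcaps`, if given, is a (forward, backward) pair.
    """
    out = {}
    if mantle is not None:
        out['mantle'] = mantle
    if high_res is not None:
        out['high_resolution'] = high_res
    if endcaps is not None:
        # No else branch needed: missing endcaps simply add no keys.
        out['endcap_forward'], out['endcap_backward'] = endcaps
    return out
```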
    endcaps = endcaps.bin(
        z_pos=sc.array(
            dims=['z_pos'],
            values=[-np.inf, 0.0, np.inf],
So this assumes that the origin is always in the middle of the detectors? i.e. the sample position. Is that always the case in Geant4?
In GEANT4, the origin is at the sample position
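Given that the origin is the sample position, the forward/backward split with edges at -inf, 0 and +inf can be illustrated with a short stdlib-only sketch. It is not the scipp implementation; the function name and the index-list output are assumptions. It makes a single pass over the events and needs no min/max search because the outer edges are infinite.

```python
import math
from bisect import bisect_right

# Same edges as in the diff above; the origin (z = 0) is the sample position.
Z_EDGES = [-math.inf, 0.0, math.inf]


def split_endcaps(z_positions):
    """Single pass: return (backward, forward) lists of event indices."""
    backward, forward = [], []
    for event, z in enumerate(z_positions):
        # Half-open bins: [-inf, 0) is backward, [0, inf) is forward.
        if bisect_right(Z_EDGES, z) - 1 == 0:
            backward.append(event)
        else:
            forward.append(event)
    return backward, forward
```

In this sketch an event at exactly z = 0 lands in the forward bin, whereas a pair of strict `> 0` and `< 0` masks would drop it.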
    'mantle',
    'high_resolution',
    'endcap_forward',
    'endcap_backward',
In a file that I got from @celinedurniak there were more entries than this: see #184 (comment)
Do we need to support both or is this now the new standard layout we will get?
I think the SUMOs in that file correspond to the endcaps. Is this correct, @celinedurniak ?
@jl-wynen you are correct. SUMOs are sub-divisions of the endcap detectors (i.e., the concentric partial rings) numbered from 3 to 6
Fixes #177
Mostly mimics the structure produced by #199 for NeXus files.
I tested it with data_dream_HF_mil_closed_alldets_1e9.csv, provided as part of the requirements page on Confluence. The older file used as part of the tutorial in Scipp here does not work as it does not contain a det ID coordinate. This is required to split the data into the separate instrument components.
Here is what the loaded data looks like:
There are some open questions:
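- Can the files contain event weight? If the test file is meant to contain events, then the detector is illuminated completely uniformly.
- Can we publish the test data? If not, how do we test the loader?
- Should this combine module, segment, and counter into a 'subsegment' as in dream.load_nexus for latest NeXus files #199?
- Should we compute voxel positions from the event positions (average over events per voxel)? As it stands, there is no voxel coordinate.
- The NeXus loader produces a detector_number coord. We cannot do this here. Is this ok?
- The loader in this PR makes a position coord for each event because that is needed by the instrument view and coord transforms. Should we drop {x,y,z}_pos? Or should we not produce position?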