Description
We have metrics on OIE downloads. We do not have good information about how many running installations of OIE there are or what features are being used. This lack of data impacts the project. Having data about the installed systems will help us win support from corporate sponsors. Having data about features that are being used (or not used) will help us prioritize work.
This telemetry will be opt-in. It should never be on by default. Forcing users to register, or even to share anonymous statistics, is wrong.
This telemetry will be non-intrusive: failures to collect or send telemetry will not impact the operation of the application. Opting out of sending telemetry will have no effect on features.
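As a rough illustration of the two rules above (opt-in only, and never allowed to break the application), here is a minimal Java sketch. `TelemetryGuard` and `trySend` are hypothetical names made up for this example; they are not existing OIE classes, and the real plugin may be structured quite differently.

```java
/**
 * Hypothetical guard around telemetry work (not an existing OIE class).
 * Telemetry runs only when the user has opted in, and a failure while
 * collecting or sending is swallowed instead of reaching the caller.
 */
final class TelemetryGuard {
    private final boolean optedIn;

    TelemetryGuard(boolean optedIn) {
        this.optedIn = optedIn;
    }

    /**
     * Runs the task only if the user opted in.
     * Returns true if the task ran and completed, false otherwise.
     */
    boolean trySend(Runnable sendTask) {
        if (!optedIn) {
            // Not opted in: nothing is collected and nothing is sent.
            return false;
        }
        try {
            sendTask.run();
            return true;
        } catch (RuntimeException e) {
            // Non-intrusive: a telemetry failure must not affect the application.
            return false;
        }
    }
}
```

With a guard like this, a network error while sending only makes `trySend` return `false`; the rest of the application never sees the exception.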
I request community feedback about:
- is "telemetry" the right word? We're not capturing metrics, we're capturing information about the number of installations
- what data can we collect to identify unique installations and protect privacy? IPs? Server ID?
- what compliance issues do we need to consider? Is a GDPR guru available to give us advice?
- what data is OK to collect?
- what data should we publish? Raw data? Aggregate stats?
- how often do we collect data? daily? weekly?
- when/how do we prompt for collection? I do not want to spam users with another popup. I hate those and my GH history shows that.
- what other open-source projects have good data collection practices that we can learn from? The Linux Foundation has a policy writeup.
My starting idea is to collect the following information:
- OIE version
- Java version
- OS
- a sufficiently anonymous identifier for the instance. This could be done with fingerprinting (I know NextGen uses oshi), or by IP or server ID. See the questions above about what is OK to use.
- a count of channels
- a count of deployed channels
- a count of types of connectors used in channels and deployed channels (e.g. 5 channels w/ HTTP, 6 channels w/ MLLP, 2 channels w/ SOAP but not deployed)
- statistics about common settings, for example: is pruning enabled, how many connectors use non-default queue settings, do TCP connectors use raw TCP or MLLP, what file writer types are in use
- This information will also be shown to the user for their instance(s). This is useful data to know about one's own instance of OIE.
- This will likely be implemented as a plugin
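To make the list above concrete, here is a rough Java sketch of what one report could look like. Everything in it is an assumption for discussion: the class name `InstallationReport`, the field names, and in particular the use of a random UUID as the instance identifier. The UUID is only one possible answer to the identifier question raised above, shown here because it is derived from no hardware or network data. The version string in the usage note is a placeholder.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical report shape; names are illustrative, not an agreed schema. */
final class InstallationReport {
    final String instanceId;
    final String oieVersion;
    final String javaVersion;
    final String os;
    final int channelCount;
    final int deployedChannelCount;
    final Map<String, Integer> connectorCounts;

    private InstallationReport(String instanceId, String oieVersion, String javaVersion,
                               String os, int channelCount, int deployedChannelCount,
                               Map<String, Integer> connectorCounts) {
        this.instanceId = instanceId;
        this.oieVersion = oieVersion;
        this.javaVersion = javaVersion;
        this.os = os;
        this.channelCount = channelCount;
        this.deployedChannelCount = deployedChannelCount;
        // Defensive copy so the report cannot change after it is built.
        this.connectorCounts = Collections.unmodifiableMap(new LinkedHashMap<>(connectorCounts));
    }

    /**
     * Assumption: a random ID generated once per install and stored locally.
     * It is not derived from the IP address, hostname, or hardware.
     */
    static String newInstanceId() {
        return UUID.randomUUID().toString();
    }

    /** Builds a report; Java version and OS name come from standard system properties. */
    static InstallationReport collect(String instanceId, String oieVersion,
                                      int channelCount, int deployedChannelCount,
                                      Map<String, Integer> connectorCounts) {
        return new InstallationReport(instanceId, oieVersion,
                System.getProperty("java.version"),
                System.getProperty("os.name"),
                channelCount, deployedChannelCount, connectorCounts);
    }
}
```

For the example in the list (5 HTTP, 6 MLLP, 2 SOAP not deployed), a caller would pass a map of those three counts with `channelCount` 13 and `deployedChannelCount` 11, plus a placeholder version such as `"0.0.0-example"`. The same object could back the local view that shows users the data for their own instance.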
Finally, I know the collection of usage stats is not well tolerated by the open-source community or by privacy-focused healthcare orgs. I hope my past comments and contributions to OIE and contributions to the user privacy plugin show my commitment to privacy and responsible use of data.