Even the best-designed surveys may need additional data cleaning and refactoring. You can make changes to the data in a tool like Excel and then upload it to Protobi, but you can also change survey data programmatically within the app.
You can add processes to:
- create a new variable
- remove respondents
- combine waves
- merge in translations
- stack patient cases, choice cards, etc.
- combine month and year into a single date
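Several of these operations take only a few lines of JavaScript. The sketch below is illustrative and is not Protobi's own code: the sample rows and the field names (`status`, `month`, `year`, `surveyDate`) are hypothetical, and it assumes the data is a plain array of row objects.

```javascript
// Hypothetical sample rows; the field names are assumptions for illustration.
const sampleRows = [
  { id: 1, status: "complete", month: 3, year: 2021 },
  { id: 2, status: "partial", month: 7, year: 2021 },
  { id: 3, status: "complete", month: 11, year: 2020 }
];

// Remove respondents: keep only the rows marked complete.
const completes = sampleRows.filter(function (row) {
  return row.status === "complete";
});

// Combine month and year into a single date string (first day of the month).
completes.forEach(function (row) {
  const mm = String(row.month).padStart(2, "0"); // 3 becomes "03"
  row.surveyDate = row.year + "-" + mm + "-01";
});

console.log(completes.map(function (row) { return row.surveyDate; }));
```

Storing the date as a `YYYY-MM-DD` string is one possible choice here; it sorts correctly as text and avoids time zone surprises that can come with `Date` objects.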
Why clean data in Protobi?
One way to change data is to edit it in SPSS, SAS, R, Excel, or another tool before uploading it to Protobi as a data table. This might be the easiest solution for one-time changes. But you can do some serious data processing in Protobi as well.
One advantage of making changes to data in Protobi is that all your processing code is in one place, neatly organized, with changes tracked. So if you get new data, or later need to review the changes you made, it's all visible in one place.
Protobi uses JavaScript to process data
Protobi can execute code written in JavaScript. JavaScript is a powerful general-purpose programming language, comparable to C, Perl, and Python, which can do many things (and possibly do them more easily than you could in SPSS or SAS).
Add a data process
In the data tab of project settings, press the green "+ New process" button to add a data process.
You will be prompted to give the process a name.
From the process table, press "Edit/Run".
You will be taken to the code-view page. It contains default code that simply returns whatever data file is in the "main" table.
Beneath that there are some comments, and some pre-written code with two forward slashes (//) at the front of each line. Slashes in front of a line of code mean that the line will not run and is read as a comment.
Delete the code on line one. Then delete the double slashes (//) from the front of the pre-written code. It should look like the screenshot below.
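To see how comments behave in JavaScript generally, here is a small sketch that is separate from the pre-written code in the app. The array name `commentDemo` is made up for this example.

```javascript
const commentDemo = [];

// commentDemo.push("skipped");  // starts with //, so this line never runs
commentDemo.push("ran");         // no leading slashes, so this line runs

console.log(commentDemo);
```

Only the second `push` takes effect. Removing the leading `//` from a line turns it back into live code.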
Run code on every respondent row
From here, you can start adding JavaScript inside the "rows.forEach" function, which runs the code within it on every row in the data. In most cases that means every respondent record, but in some cases it might be every patient case record, etc. It depends on whether your data has stacked survey loops.
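A minimal sketch of this pattern follows. The wrapper function `runProcess`, the sample data, and the fields `age` and `ageGroup` are hypothetical, added so the example runs outside Protobi. It assumes the "main" table is an array of row objects; the actual pre-written code in the app may differ.

```javascript
// Hypothetical stand-in for the process body, so the sketch runs on its own.
function runProcess(data) {
  // Assumption: the "main" table is an array of row objects.
  const rows = data["main"];

  rows.forEach(function (row) {
    // Create a new variable from an existing one (field names are made up).
    if (row.age == null) {
      row.ageGroup = null;            // leave missing answers as missing
    } else if (row.age < 35) {
      row.ageGroup = "Under 35";
    } else {
      row.ageGroup = "35 or older";
    }
  });

  return rows;
}

// Hypothetical sample data: three respondents, one with no age recorded.
const exampleData = { main: [{ id: 1, age: 28 }, { id: 2, age: 52 }, { id: 3 }] };
const processed = runProcess(exampleData);
console.log(processed);
```

Because `forEach` visits each row in turn, anything assigned to `row` inside the function becomes a new field on that record. Handling the missing-value case first keeps blank answers from falling into one of the groups by accident.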