Regular expression for searching “near” text (i.e., within X words)

The Chrome function for finding text on a webpage (Ctrl-F) is fine, but it doesn’t allow finding one word “near” another one. For example, if the text on the page has many instances of “this” and many instances of “that,” using Ctrl-F to find a spot where it says “this and also that” is tedious.

The Chrome plug-in “Regex Search” allows regular expressions for searching the text on a webpage. Regular expressions are powerful, but they can be a pain to create.
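As a rough illustration of the idea, a "near" search can be written as one word, then a limited number of intervening words, then the other word. The sketch below builds such a pattern in Python. The helper name `near_pattern` and the `max_gap` parameter are made up for this example, and it is an assumption that the plug-in accepts the same pattern syntax.

```python
import re

def near_pattern(first, second, max_gap=5):
    """Build a regex that matches `first`, then at most `max_gap`
    intervening words, then `second` (one direction only)."""
    return re.compile(
        r"\b" + re.escape(first)              # the first word, on a word boundary
        + r"\W+(?:\w+\W+){0," + str(max_gap)  # up to max_gap words in between
        + r"}?" + re.escape(second) + r"\b",  # lazy, then the second word
        re.IGNORECASE,
    )

text = "We tried this and also that, but nothing else."
match = near_pattern("this", "that", max_gap=2).search(text)
print(match.group(0) if match else None)  # prints "this and also that"
```

This pattern only finds "this" before "that". Finding the words in either order would need a second pattern with the two words swapped.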


SPSS code for calculating the confidence interval of a Pearson correlation

For some reason, SPSS does not offer an option to calculate the confidence interval of an observed Pearson correlation, although SAS and Stata both do. SPSS will calculate the correlation itself, but not its confidence interval.

The SPSS syntax below calculates the confidence interval.  I’ve drawn on some code from the SPSS website; I’ve made the code easier to use and the results easier to interpret. I have verified the calculations against what I get using Stata, and the syntax calculates the confidence intervals correctly.
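For readers who want to check the numbers outside SPSS, here is a minimal Python sketch of the usual Fisher z-transformation approach to this interval. It is an assumption that the SPSS syntax uses the same method. The function name `pearson_ci` and its parameters are made up for this example.

```python
import math
from statistics import NormalDist

def pearson_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson correlation r
    observed in a sample of size n, via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                  # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)        # standard error of z
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. about 1.96 for 95%
    # Transform the interval endpoints back to the correlation scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

lower, upper = pearson_ci(0.5, 50)
print(round(lower, 2), round(upper, 2))
```

For r = .50 and n = 50, this gives an interval of roughly .26 to .68.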

VBA code for performing many searches-and-replaces in an Excel file

I had a long list of search-and-replace tasks that I needed to do on a text file. I wanted an easy-to-use tool, so I put together something in Excel using VBA. The tool starts with a “BEFORE replacing” worksheet and then creates an “AFTER replacing” worksheet. The replacements are performed based on a “Definitions of Replacements” worksheet, which directs the macro to replace everything in the first column with a paired item in the second column.
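The core logic is simple enough to sketch outside Excel. The Python below is not the VBA macro itself, only an illustration of the same idea: a list of (find, replace) pairs applied in order to the "before" text. The function name `apply_replacements` and the sample pairs are made up for this example.

```python
def apply_replacements(text, pairs):
    """Apply each (find, replace) pair to the text, in list order."""
    for find, replace in pairs:
        text = text.replace(find, replace)
    return text

# Hypothetical stand-in for the "Definitions of Replacements" worksheet:
# first column is what to find, second column is what to put in its place.
definitions = [
    ("colour", "color"),
    ("centre", "center"),
]

before = "The colour of the centre tile"
after = apply_replacements(before, definitions)
print(after)  # prints "The color of the center tile"
```

Because the pairs are applied one after another, order matters: a later pair can change text that an earlier pair produced.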

The effect of trimming outliers on a mean

It is well known that means are sensitive to outliers. Recently, I was asked how the mean would change if outliers were removed from the analysis. Leaving aside questions of how to identify outliers and whether they should be trimmed, the change in the mean is given by this formula:

change in mean = (original mean * number of trimmed observations - sum of the trimmed values) / (original sample size - number of trimmed observations)

Consider an example with the original mean = 5.5 and the original sample size = 250. Suppose three observations with “1” values were trimmed and another three observations with “7” values were trimmed. The six trimmed values sum to 3 * 1 + 3 * 7 = 24, so the change in mean would be (5.5 * 6 - 24) / (250 - 6) = .037. Therefore, the mean would increase by .037 (i.e., from 5.5 to 5.537).
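The same arithmetic can be checked with a few lines of Python. The change in the mean is the original mean times the number of trimmed observations, minus the sum of the trimmed values, all divided by the remaining sample size. The function name `change_in_mean` is made up for this sketch.

```python
def change_in_mean(original_mean, n, trimmed_values):
    """Change in the mean after removing trimmed_values from a sample
    of size n whose mean was original_mean."""
    k = len(trimmed_values)
    if k >= n:
        raise ValueError("cannot trim every observation")
    return (original_mean * k - sum(trimmed_values)) / (n - k)

# The example above: three 1s and three 7s trimmed from 250 observations.
change = change_in_mean(5.5, 250, [1, 1, 1, 7, 7, 7])
print(round(change, 3))        # prints 0.037
print(round(5.5 + change, 3))  # prints 5.537
```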

 
– Eric DeRosia

Make-good HIT for MTurk Glitches

Every once in a while, I get an e-mail from an MTurk worker who says they experienced a technical glitch with my HIT and couldn’t submit it successfully. I can usually tell it’s not a scam because they mention things from the HIT itself. (At a purely rational level, it’s probably not worth their time to write long e-mails about a missing $0.59 HIT reward, but such things can become emotional for some workers.)

Javascript for Qualtrics: performing mathematical operations on quantitative responses

In Qualtrics, I needed to take a quantitative survey response, perform some mathematical operations on the response, and then (1) display the output to the survey respondent and (2) write the output to the Qualtrics dataset. It turns out this is a little tricky, but as described in this post, Javascript can do this task nicely.