At Orgtheory, Fabio asked about how to identify substrings within text fields in Stata. Although this is a seemingly simple proposal, there is one big problem, as Gabriel Rossman points out: Stata string fields can only hold 244 characters of text. As Fabio desires to use this field to analyze scientific abstracts, then 244 characters is obviously insufficient.
Gabriel Rossman has posted a solution he has called
grepmerge that uses the Linux-based program
grep to search for strings in files. This is a great solution, but it comes with one large caveat: it cannot be used in a native Windows environment. This is because the
grep command ...