There are letter replacements or swaps used in English, especially in informal language, such as X used for “ex” and K used in place of C, that could cause issues for natural language processing (NLP).
As with all endeavors, data is messy, and morphs of language are no exception. This article doesn’t solve these issues in NLP, nor have I spent any time looking into pre-built solutions. It is only meant to provide examples, links, and an overview of the situation.
Wikipedia lists these situations as:
K for C examples
Drop the C in CK example
- Nestle Quik/Nesquik (quick)
X can represent:
- “trans-” (e.g. XMIT for transmit, XFER for transfer)
- “cross-” (e.g. X-ing for crossing, XREF for cross-reference)
- “Christ-” as shorthand (e.g. Xmas for Christmas, Xian for Christian)
- the “crys-” in crystal (XTAL)
- various words starting with “ex-” (e.g. XL for extra large, XOR for exclusive-or)
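
To make the problem concrete, here is a minimal sketch of one naive way such variants could be mapped back to standard spellings before further processing. It is not a solution taken from any library: the `VARIANT_TO_STANDARD` table and the `normalize_variants` function are hypothetical names, and the table only contains the examples listed above.

```python
import re

# Hypothetical lookup table built only from the examples listed above.
# A real system would need a much larger, curated lexicon.
VARIANT_TO_STANDARD = {
    "quik": "quick",
    "xmit": "transmit",
    "xfer": "transfer",
    "x-ing": "crossing",
    "xref": "cross-reference",
    "xmas": "christmas",
    "xian": "christian",
    "xtal": "crystal",
    "xl": "extra large",
    "xor": "exclusive-or",
}

# Match runs of letters, allowing internal hyphens so "X-ing" stays one token.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def normalize_variants(text: str) -> str:
    """Replace known variant spellings with their (lowercased) standard form."""

    def replace(match: re.Match) -> str:
        token = match.group(0)
        # Unknown tokens are returned unchanged, original casing included.
        return VARIANT_TO_STANDARD.get(token.lower(), token)

    return TOKEN_PATTERN.sub(replace, text)


print(normalize_variants("Merry Xmas, please XFER the file"))
```

A context-free lookup like this will misfire on ambiguous tokens (XL, for instance, is also a Roman numeral), which is part of why these swaps are awkward for NLP.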