Update hashtag regex
If you stare into the abyss long enough, you'll find that Twitter's hashtag regex is exceedingly complex, because it tries to catch edge cases like forgetting to put a space after a sentence's closing punctuation, e.g., "hey guys this is cool.#awesome".
That's crazy. But you can find more details about it here: https://github.com/twitter/twitter-text/
The reality is that Twitter's hashtag rules have evolved over the years from a base ruleset to include some Unicode characters and other additions that make them friendlier to non-English-speaking users. The result is really, really complicated. If we go back to Old School Twitter rules, or "what English-speaking users encounter", the rules are pretty simple:
- Letters, digits, and underscores are allowed (no dashes or other punctuation!)
- The hashtag must start with a letter
This regex change should match these expectations: #100 won't be recognized as a hashtag, but #go100 will.
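
For illustration only, here is a minimal Python sketch of a pattern that follows the two rules above. It is not the exact regex in this change: the names `HASHTAG_RE` and `find_hashtags` are made up for the example, and the lookbehind that refuses a `#` glued to a preceding word character is an extra assumption that the rules above don't spell out.

```python
import re

# Hypothetical pattern, not the one from this change.
# (?<!\w)         assumption: '#' must not directly follow a word character
# [A-Za-z]        the hashtag must start with a letter
# [A-Za-z0-9_]*   then letters, digits, and underscores only
HASHTAG_RE = re.compile(r"(?<!\w)#([A-Za-z][A-Za-z0-9_]*)")


def find_hashtags(text):
    """Return the hashtag names found in text, without the leading '#'."""
    return HASHTAG_RE.findall(text)


if __name__ == "__main__":
    print(find_hashtags("#100 is not a tag, but #go100 is"))
```

With this sketch, "#100" produces no match and "#go100" yields "go100". A dash ends the tag, so "#no-dash" yields only "no".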
This fixes #1