Sentence Case: automatically capitalize sentences
Pascal Getreuer, 2022-11-30 (updated 2024-01-27)
Overview
The first letter of each sentence is capitalized in typical English writing and other languages with a case distinction. This post describes a QMK userspace feature that automatically applies Shift to capitalize when starting a new sentence. This reduces how often you need to use the Shift keys, which is convenient particularly if you use home row mods or Auto Shift.
To use it, you simply type as usual but without shifting at the start of sentences. The feature detects when new sentences begin and capitalizes automatically.
Example: In the following input, Sentence Case interprets “bug.”, “problem?”, and “relevant.” as sentence endings, the words after them (“what”, “describe”, “please”) as sentence starts, and “e.g.” and “etc.” as abbreviations.
Include these details when reporting a bug. what steps produce the problem? describe your configuration e.g. non-default settings, extensions, etc. where relevant. please attach any error logs to this form.
The output produced is:
Include these details when reporting a bug. What steps produce the problem? Describe your configuration e.g. non-default settings, extensions, etc. where relevant. Please attach any error logs to this form.
Add it to your keymap
If you are new to QMK macros, see my macro buttons post for an intro.
Step 1: In keymap.c, define or add to your process_record_user() function to call process_sentence_case():
#include "features/sentence_case.h"

bool process_record_user(uint16_t keycode, keyrecord_t* record) {
  if (!process_sentence_case(keycode, record)) { return false; }
  // Your macros ...

  return true;
}
Note: If you happen to also use custom shift keys, be sure to call process_sentence_case() before process_custom_shift_keys().
Step 2: In your rules.mk file, add
SRC += features/sentence_case.c
Step 3: In the directory containing your keymap.c, create a features subdirectory and copy sentence_case.h and sentence_case.c there.
Note: One-shot keys must be enabled to use Sentence Case. One-shot keys are enabled by default, but can be disabled by #define NO_ACTION_ONESHOT in config.h. If your config.h includes such a line, please remove it.
Troubleshooting: If your keymap fails to build, a likely reason is that your QMK installation simply needs to be updated. If you have the qmk_firmware git repo cloned locally, do a git pull. Or see Updating your master branch for more details.
How to use it: With the above done and flashed to your keyboard, use Sentence Case simply by typing as normal but without shifting at the start of sentences. For example, typing
hey. hello.
should produce
hey. Hello.
The feature kicks in after seeing a sentence ending, so the h in hey might not be capitalized.
Overriding Sentence Case: It is possible that Sentence Case false-triggers and capitalizes when it isn’t wanted. This happens especially with abbreviations having a single period at the end, like “misc.”, which to the simple detection rule looks like a sentence ending (while the rule correctly recognizes abbreviations like “e.g.” containing intermediate periods as not real endings). To override false triggers:
- You can intervene manually: if Sentence Case falsely capitalizes a letter, you can backspace and re-type the letter.
- If you regularly use an abbreviation where Sentence Case false triggers, define an exception for it in sentence_case_check_ending(). See Defining exceptions.
- If Sentence Case isn’t wanted in some circumstances, maybe while gaming or writing code, you can disable it temporarily with sentence_case_off() and later turn it back on with sentence_case_on(). See the Functions below.
Alternatives
Instead of Sentence Case, an alternative method to capitalize the start of sentences is the next sentence macro, which types a period and a space, then sets a one-shot mod so that the next key typed is shifted. This could be initiated with a macro button, or through a tap dance as in precondition’s keymap. Compared to Sentence Case, the main difference is that the macro is explicitly activated at the end of the sentence, whereas Sentence Case detects sentence endings automatically.
Auto-capitalization of sentences is a built-in feature in some editors, including Microsoft Office, LibreOffice, and Google Docs. In Emacs, you can use auto-capitalize.el or Xah Lee’s xah-upcase-sentence. In Vim, a method is described by David Moody.
Sentence detection
Detecting sentences is more involved than it might first seem. To use this feature effectively, it helps to know how the detection works.
Let’s look again at this example:
Include these details when reporting a bug. what steps produce the problem? describe your configuration e.g. non-default settings, extensions, etc. where relevant. please attach any error logs to this form.
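The parts of interest are the sentence endings (“bug.”, “problem?”, “relevant.”), the sentence starts that follow them, and the abbreviations “e.g.” and “etc.”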
A new sentence is detected when the following sequence occurs:
- One or more letter characters (a word).
- Followed by sentence-ending punctuation ., ?, or !. The following exceptions are considered abbreviations instead of real sentence endings:
  - Word containing multiple periods, like U.S. or i.e.
  - The common abbreviations vs. and etc. are specifically recognized (see below for how to define further exceptions).
- Followed by one or more spaces.
- Followed by a letter character ⇒ Sentence start detected! The letter is shifted to capitalize.
To support quoted sentences, quotes ' and " are allowed in any position.
Customization
Indicating primed state
You can use the callback sentence_case_primed() to indicate with an LED or otherwise that Sentence Case is “primed.” A primed state means that a sentence ending was detected, and if the next key typed is a letter, it will be capitalized. In your keymap.c, add
void sentence_case_primed(bool primed) {
  // Change B0 to the pin for the LED to use.
  writePin(B0, primed);
}
Knowing when Sentence Case is primed is useful feedback to use the feature effectively.
Idle timeout
Sentence Case may optionally be configured to reset its state if the keyboard is idle for some time. This is useful to mitigate unintended shifting when switching between typing and using the mouse. In your config.h, define SENTENCE_CASE_TIMEOUT with a time in milliseconds:
#define SENTENCE_CASE_TIMEOUT 2000 // Reset state after 2 seconds.
and in your keymap.c, define (or add to) matrix_scan_user() as
void matrix_scan_user(void) {
  sentence_case_task();
  // Other tasks...
}
The default behavior (when SENTENCE_CASE_TIMEOUT isn’t set, or set to 0) is that Sentence Case never times out.
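To see why the timeout needs a function that is called from matrix_scan_user(), here is a minimal standalone sketch of idle-timeout bookkeeping in plain C. It is an illustration under stated assumptions, not the implementation of sentence_case_task(): idle_tracker_t, idle_tracker_on_key(), idle_tracker_task(), and IDLE_TIMEOUT_MS are hypothetical names, and timestamps are passed in by the caller as milliseconds.

```c
#include <stdbool.h>
#include <stdint.h>

#define IDLE_TIMEOUT_MS 2000  // Hypothetical stand-in for SENTENCE_CASE_TIMEOUT.

// Hypothetical bookkeeping for an idle timeout.
typedef struct {
  bool     active;         // True while there is state that could be reset.
  uint32_t last_key_time;  // Time of the most recent key press, in ms.
} idle_tracker_t;

// Call on every key press: restart the idle timer.
void idle_tracker_on_key(idle_tracker_t* t, uint32_t now_ms) {
  t->active = true;
  t->last_key_time = now_ms;
}

// Call periodically. Returns true once when the keyboard has been idle for
// at least IDLE_TIMEOUT_MS, which is the point where state would be reset.
bool idle_tracker_task(idle_tracker_t* t, uint32_t now_ms) {
  // Unsigned subtraction stays correct if the millisecond counter wraps.
  if (t->active && (uint32_t)(now_ms - t->last_key_time) >= IDLE_TIMEOUT_MS) {
    t->active = false;
    return true;
  }
  return false;
}
```

A key press alone cannot notice that time has passed, so some periodic call is needed to compare the current time against the last key press. In this sketch, that is the role idle_tracker_task() plays.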
Functions
Functions to manipulate Sentence Case:
Function | Description |
---|---|
sentence_case_on() | Enables Sentence Case. |
sentence_case_off() | Disables Sentence Case. |
sentence_case_toggle() | Toggles Sentence Case. |
is_sentence_case_on() | Gets whether currently enabled. |
sentence_case_clear() | Clears Sentence Case to initial state. |
These functions can be used to enable, disable, toggle, or clear Sentence Case from your keymap with a macro, combo, tap dance, or any other means.
Defining exceptions
The sentence_case_check_ending() callback is called when a punctuating key is typed to decide whether it is a real sentence ending, meaning the first letter of the following word should be capitalized. For instance, abbreviations like “vs.” are usually not real sentence endings. The input argument is a buffer of the last SENTENCE_CASE_BUFFER_SIZE keycodes (by default, the last 8 keycodes). Returning true means it is a real sentence ending and automatic capitalization applies; returning false means it is not.
The default implementation checks for the abbreviations “vs.” and “etc.”:
bool sentence_case_check_ending(const uint16_t* buffer) {
  // Don't consider the abbreviations "vs." and "etc." to end the sentence.
  if (SENTENCE_CASE_JUST_TYPED(KC_SPC, KC_V, KC_S, KC_DOT) ||
      SENTENCE_CASE_JUST_TYPED(KC_SPC, KC_E, KC_T, KC_C, KC_DOT)) {
    return false;  // Not a real sentence ending.
  }
  return true;  // Real sentence ending; capitalize next letter.
}
Here, SENTENCE_CASE_JUST_TYPED() is a helper macro that checks whether the key buffer ends in a given keycode pattern. The expression SENTENCE_CASE_JUST_TYPED(KC_SPC, KC_V, KC_S, KC_DOT) returns true if “space, v, s, .” were the last four keys typed.
Notes:
- Patterns must be at most SENTENCE_CASE_BUFFER_SIZE keys long, 8 by default.
- Patterns are case-insensitive: modifiers are not considered in this check.
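As a rough picture of what a check like SENTENCE_CASE_JUST_TYPED() has to do, the sketch below compares a pattern against the tail of a fixed-size keycode buffer. This is an illustration only, not the macro's actual implementation: buffer_ends_with() and BUFFER_SIZE are hypothetical names, and the sketch assumes the buffer stores keycodes oldest first, with the most recently typed key in the last slot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUFFER_SIZE 8  // Hypothetical; mirrors the default of 8 keycodes.

// Hypothetical helper: returns true if the last `n` keycodes in `buffer`
// (BUFFER_SIZE entries, oldest first) are equal to `pattern`.
bool buffer_ends_with(const uint16_t* buffer, const uint16_t* pattern,
                      size_t n) {
  if (n > BUFFER_SIZE) {
    return false;  // A pattern longer than the buffer can never match.
  }
  const uint16_t* tail = buffer + (BUFFER_SIZE - n);
  for (size_t i = 0; i < n; ++i) {
    if (tail[i] != pattern[i]) {
      return false;
    }
  }
  return true;
}
```

The length check illustrates the first note above: a pattern longer than the buffer has nothing to be compared against. A comparison of this kind looks only at keycodes, which fits the second note: Shift is a modifier rather than part of the letter's keycode, so “vs.” and “VS.” would look the same.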
Add other abbreviations by adding them in the condition. For example, to check also for “misc.”:
if (SENTENCE_CASE_JUST_TYPED(KC_SPC, KC_V, KC_S, KC_DOT) ||
    SENTENCE_CASE_JUST_TYPED(KC_SPC, KC_E, KC_T, KC_C, KC_DOT) ||
    SENTENCE_CASE_JUST_TYPED(KC_SPC, KC_M, KC_I, KC_S, KC_C, KC_DOT)) {
  return false;  // Not a real sentence ending.
}
The buffer size SENTENCE_CASE_BUFFER_SIZE can be changed by defining it in config.h:
#define SENTENCE_CASE_BUFFER_SIZE 10
Setting SENTENCE_CASE_BUFFER_SIZE to 0 disables the sentence ending check.
Letters and punctuations
The callback sentence_case_press_user() defines which keys are letters, punctuation, or something else. Defining this function may be useful if you type non-US letters or have customized the shift behavior of the punctuation keys.
The return value is a char code telling Sentence Case how to interpret the key:
Code | Description |
---|---|
'a' | Key is a letter, by default KC_A to KC_Z. If occurring at the start of a sentence, Sentence Case applies shift to capitalize it. |
'.' | Key is sentence-ending punctuation. Default: KC_DOT, Shift + KC_SLSH (?), Shift + KC_1 (!). |
'#' | Key types a backspaceable character that isn’t part of a word. The default includes KC_SLSH, Shift + KC_DOT (>), digits 0–9, KC_MINS ... KC_SCLN, which covers - = [ ] backslash ;, the keys KC_GRV (`) and KC_COMM (,), and KC_UNDS ... KC_COLN, which covers _ + { } \| :. |
' ' | Key is a space. Default: KC_SPC. |
'\'' | Key types a quote or double quote character. Default: KC_QUOT. |
'\0' | Sentence Case should ignore this key. |
If a hotkey or navigation key is pressed (or another key that performs an action that backspace doesn’t undo), then the callback should call sentence_case_clear() to clear the state and then return '\0'.
The default sentence_case_press_user() is:
char sentence_case_press_user(uint16_t keycode,
                              keyrecord_t* record,
                              uint8_t mods) {
  if ((mods & ~(MOD_MASK_SHIFT | MOD_BIT(KC_RALT))) == 0) {
    const bool shifted = mods & MOD_MASK_SHIFT;
    switch (keycode) {
      case KC_A ... KC_Z:
        return 'a';  // Letter key.

      case KC_DOT:  // . is punctuation, Shift . is a symbol (>)
        return !shifted ? '.' : '#';
      case KC_1:
      case KC_SLSH:
        return shifted ? '.' : '#';
      case KC_EXLM:
      case KC_QUES:
        return '.';
      case KC_2 ... KC_0:  // 2 3 4 5 6 7 8 9 0
      case KC_AT ... KC_RPRN:  // @ # $ % ^ & * ( )
      case KC_MINS ... KC_SCLN:  // - = [ ] backslash ;
      case KC_UNDS ... KC_COLN:  // _ + { } | :
      case KC_GRV:
      case KC_COMM:
        return '#';  // Symbol key.

      case KC_SPC:
        return ' ';  // Space key.

      case KC_QUOT:
        return '\'';  // Quote key.
    }
  }

  // Otherwise clear Sentence Case to initial state.
  sentence_case_clear();
  return '\0';
}
To customize, copy the above function into your keymap and add/remove keycodes to the above cases.
Explanation
We search for sentence beginnings using a simple finite state machine. It matches things like “a. a” and “a.  a” (one or more spaces after the period) but not “a.. a” or “a.a. a”. The states are:
State | Description |
---|---|
INIT | Initial enabled state. |
WORD | Within a word. |
ABBREV | Within an abbreviation like “e.g.”. |
ENDING | Sentence ended. |
PRIMED | “Primed” state, in the space following an ending. |
Given the char code from sentence_case_press_user(), the state transition matrix is:
State | 'a' | '.' | ' ' |
---|---|---|---|
INIT | WORD | INIT | INIT |
WORD | WORD | ENDING | INIT |
ABBREV | ABBREV | ABBREV | INIT |
ENDING | ABBREV | INIT | PRIMED |
PRIMED | match! | INIT | PRIMED |
When the char code is '#' (symbol), the state is set to INIT. When the char code is '\'' (quote), the state is unchanged. When the char code is '\0', Sentence Case ignores the key.
- When the state changes to ENDING, the sentence_case_check_ending() callback is called. If it returns false, the state is set to INIT.
- When the state changes to or from PRIMED, the sentence_case_primed() callback is called.
- When a letter is typed during PRIMED state, one-shot Shift is set to capitalize the letter and the state changes to WORD.
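To make the transition matrix concrete, here is a minimal standalone sketch in plain C that runs it over a string rather than over QMK key events. It is not the code in sentence_case.c: classify_char() and sentence_case_simulate() are hypothetical names invented for this illustration, the classifier works on ASCII characters instead of keycodes, and the sentence_case_check_ending() exceptions are left out, so “etc.” counts as a sentence ending here.

```c
#include <ctype.h>
#include <stddef.h>

// The five states from the table above.
typedef enum { INIT, WORD, ABBREV, ENDING, PRIMED } sc_state_t;

// Hypothetical stand-in for sentence_case_press_user(): classifies an ASCII
// character instead of a QMK keycode.
static char classify_char(char c) {
  if (isalpha((unsigned char)c)) return 'a';
  if (c == '.' || c == '?' || c == '!') return '.';
  if (c == ' ') return ' ';
  if (c == '\'' || c == '"') return '\'';
  return '#';  // Everything else is treated as a symbol.
}

// Walks `text` through the transition matrix, capitalizing in place any
// letter that arrives while the machine is in the PRIMED state.
void sentence_case_simulate(char* text) {
  sc_state_t state = INIT;
  for (size_t i = 0; text[i] != '\0'; ++i) {
    switch (classify_char(text[i])) {
      case 'a':
        if (state == PRIMED) {  // Match: sentence start detected.
          text[i] = (char)toupper((unsigned char)text[i]);
          state = WORD;
        } else if (state == ENDING || state == ABBREV) {
          state = ABBREV;  // Letter right after a period, as in "e.g."
        } else {
          state = WORD;
        }
        break;
      case '.':
        if (state == WORD) {
          state = ENDING;
        } else if (state != ABBREV) {
          state = INIT;  // ABBREV stays ABBREV; other states reset.
        }
        break;
      case ' ':
        state = (state == ENDING || state == PRIMED) ? PRIMED : INIT;
        break;
      case '\'':
        break;  // Quotes leave the state unchanged.
      default:
        state = INIT;  // Symbols reset the state.
        break;
    }
  }
}
```

Calling sentence_case_simulate() on a buffer holding "hey. hello." turns it into "hey. Hello.", while "e.g. this" is left unchanged: the letter typed directly after the first period moves the machine into ABBREV, and the space after an abbreviation returns it to INIT instead of PRIMED.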
Acknowledgements
Thanks to GitHub users @drashna, @EdenEast and Reddit user u/WandersFar for helpful feedback and contributions to Sentence Case.
Closing thoughts
It’s exciting what effects may be possible with features that track the context of what was just typed beyond the current key. Check out triggering based on previously typed keys for thoughts on how to implement such effects generally.