I have created the following regular expression for converting copied chess24.com games into PGN-compatible games:
\s*(\d{1,3})\.?\s*((?:(?:O-O(?:-O)?)|(?:[KQNBR][1-8a-h]?x?[a-h]x?[1-8])|(?:[a-h]x?[a-h]?[1-8]\=?[QRNB]?))\+?)(?:\s*\d+\.?\d+?m?s)?\.?\s*((?:(?:O-O(?:-O)?)|(?:[KQNBR][1-8a-h]?x?[a-h]x?[1-8])|(?:[a-h]x?[a-h]?[1-8]\=?[QRNB]?))\+?)?(?:\s*\d+\.?\d+?m?s)?
with the replacement field being
\1. \2 \3\n
or
$1. $2 $3\n
depending on which backreference syntax your regex engine uses.
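As a quick check of the replacement, here is a minimal Python sketch that assembles the same one-line pattern from two named pieces and applies the `\1. \2 \3\n` replacement. The names `MOVE`, `TIME`, `PATTERN` and `to_pgn`, as well as the sample input, are made up for illustration; the real text copied from chess24.com may be laid out differently.

```python
import re

# One half-move: castling, a piece move/capture, or a pawn move/capture/promotion,
# optionally followed by "+" for check (same alternatives as the regex above).
MOVE = r"(?:(?:O-O(?:-O)?)|(?:[KQNBR][1-8a-h]?x?[a-h]x?[1-8])|(?:[a-h]x?[a-h]?[1-8]\=?[QRNB]?))\+?"
# Optional move time such as "1.2s" or "10s"; it is matched but not captured.
TIME = r"(?:\s*\d+\.?\d+?m?s)?"

PATTERN = re.compile(
    r"\s*(\d{1,3})\.?\s*(" + MOVE + r")" + TIME + r"\.?\s*(" + MOVE + r")?" + TIME
)

def to_pgn(text):
    # "\\1. \\2 \\3\n" is the replacement field shown above, written for Python.
    return PATTERN.sub("\\1. \\2 \\3\n", text)

# Hypothetical sample input, not an actual chess24.com export.
sample = "1. e4 1.2s e5 0.8s 2. Nf3 10s Nc6 3.5s"
print(to_pgn(sample))
```

Each match covers one full move (number, white move, optional black move), so every move pair ends up on its own line.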
Verbose regex in Python:
import re

chess_pattern = re.compile(r"""
\s* # Whitespace
(\d{1,3}) # Capture group 1: Move number (one to three digits) that precedes the white side's move.
\.? # Optional literal period, in case move numbers are followed by a period. The replacement pattern restores the period, so it is not captured.
\s* # Whitespace
( # Capture group 2: This will collect the white side's move
(?: # Start non-capturing group A: Use vertical bar | between non-capturing groups to check for castling, piece moves/captures, pawn moves/captures/promotion
(?:O-O(?:-O)?) # Non-capturing subgroup A1: For castling kingside or queenside. Change the O to 0 to work for sites that use 0-0 for castling notation
|(?:[KQNBR][1-8a-h]?x?[a-h]x?[1-8]) # Non-capturing subgroup A2: For piece (non-pawn) moves and piece captures
|(?:[a-h]x?[a-h]?[1-8]\=?[QRNB]?) # Non-capturing subgroup A3: Pawn moves, captures, and promotions
) # End non-capturing group A
\+? # Allow plus symbol for checks (attacks on king)
) # End capturing group 2: White side's move
(?:\s*\d+\.?\d+?m?s)? # Non-capturing group B: Skip over move-times; it is possible to retain move times if you make this a capturing group
\.? # Allow period in case a time ends in a decimal point
\s* # Whitespace
( # Capture group 3: This will collect the black side's move
(?: # Start non-capturing group C: Use vertical bar | between non-capturing groups to check for castling, piece moves/captures, pawn moves/captures/promotion
(?:O-O(?:-O)?) # Non-capturing subgroup C1: For castling kingside or queenside. Change the O to 0 to work for sites that use 0-0 for castling notation
|(?:[KQNBR][1-8a-h]?x?[a-h]x?[1-8]) # Non-capturing subgroup C2: For piece (non-pawn) moves and piece captures
|(?:[a-h]x?[a-h]?[1-8]\=?[QRNB]?) # Non-capturing subgroup C3: Pawn moves, captures, and promotions
) # End non-capturing group C
\+? # Allow plus symbol for checks (attacks on king)
)? # End capturing group 3: Black side's move. Question mark allows final move to be white side's move without any subsequent black moves
(?:\s*\d+\.?\d+?m?s)? # Non-capturing group D: Skip over move-times; it is possible to retain move times if you make this a capturing group
""",re.VERBOSE)
# Paste the entire chess game inside the triple-quoted string below, in place of the ...
chess_game = """
...
"""

# Convert once, then reuse the result for both outputs
pgn = chess_pattern.sub(r'\1. \2 \3 ' + '\n', chess_game)
print(pgn)  # Will output PGN to console

# The following writes the PGN to a file `game.pgn` in the working directory
with open('game.pgn', 'w') as output_PGN:
    output_PGN.write(pgn)
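The comments on groups B and D mention that move times can be retained by making those groups capturing. A rough sketch of one way to do that follows, writing each time as a PGN brace comment after its move. The brace-comment output format, the helper names (`with_time`, `to_pgn_with_times`) and the sample input are my own assumptions for illustration, not something chess24.com or the regex above prescribes.

```python
import re

MOVE = r"(?:(?:O-O(?:-O)?)|(?:[KQNBR][1-8a-h]?x?[a-h]x?[1-8])|(?:[a-h]x?[a-h]?[1-8]\=?[QRNB]?))\+?"
# Same time pattern as before, but the time text itself is now captured.
TIME = r"(?:\s*(\d+\.?\d+?m?s))?"

# Groups: 1 = move number, 2 = white move, 3 = white time,
#         4 = black move,  5 = black time
PATTERN = re.compile(
    r"\s*(\d{1,3})\.?\s*(" + MOVE + r")" + TIME + r"\.?\s*(" + MOVE + r")?" + TIME
)

def with_time(move, time):
    # Append the time as a brace comment, e.g. "e4 {1.2s}", when one was captured.
    return f"{move} {{{time}}}" if time else move

def to_pgn_with_times(text):
    def repl(match):
        number, white, white_time, black, black_time = match.groups()
        parts = [f"{number}.", with_time(white, white_time)]
        if black:  # the final move of a game may have no black reply
            parts.append(with_time(black, black_time))
        return " ".join(parts) + "\n"
    return PATTERN.sub(repl, text)

# Hypothetical sample input.
print(to_pgn_with_times("1. e4 1.2s e5 0.8s 2. Nf3 10s"))
```

A replacement function is used instead of a replacement string because the time and black-move groups are optional, and the function can simply skip whichever ones did not match.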
See here for an example of this in action: regexr.com/58ngb
I have also implemented the above as a Clipboard Fusion (C#) macro here: https://www.clipboardfusion.com/Macros/View/?ID=d220984d-faa4-4ba2-ab86-f16dceb42036
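For sites that write castling with zeros, an alternative to editing subgroups A1 and C1 is to normalise the input before running the regex, so that the output uses the capital letter O that PGN expects. The sketch below is only one possible approach; `ZERO_CASTLING` and `normalize_castling` are hypothetical names, and the sample strings are invented.

```python
import re

# "0-0" or "0-0-0" that is not part of a longer run of digits or hyphens,
# so game results such as "1-0" are left untouched.
ZERO_CASTLING = re.compile(r"(?<![\d-])0-0(-0)?(?![\d-])")

def normalize_castling(text):
    # Group 1 is only set for the queenside form "0-0-0".
    return ZERO_CASTLING.sub(
        lambda m: "O-O-O" if m.group(1) else "O-O", text
    )

# Hypothetical sample input.
print(normalize_castling("9. 0-0 1.2s 0-0-0 2.5s"))
```

The normalised text can then be passed to the pattern above without changing the regex itself.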