Alister West

Find files that contain unicode characters in comments

Unicode characters in comments are usually copy/paste artifacts. Unfortunately, bash + screen + vim don't play nicely with them.

# -n joins all @ARGV into one big file and feeds in the lines one by one.

perl -nE '
    # skip files likely to contain wide chars
    # (next, not exit: exit would stop the whole run at the first such file)
    next if $ARGV =~ /\.(png|gif|js)$/;

    # reset line counter (-n does not reset $. between files)
    $. = 1 if $ARGV ne $oldfile;
    $oldfile = $ARGV;

    # Check every char on the line and report those outside ASCII (0-127).
    for (split //, $_) {
        say "$ARGV:$.> $_:" . ord($_) if ord($_) > 127
    }
' $(find . -type f)

# and all on one line
perl -nE 'next if $ARGV =~ /\.(png|gif|js)$/; $.=1 if $ARGV ne $old; $old=$ARGV; for(split//,$_){ say "$ARGV:$.> $_:". ord $_ if ord $_ > 127 }' $(find . -type f)
By Alister West