Replace Non-breaking Space UTF-8 (C2 A0) | Notepad++ Community

  • Bart Heinsius

    Hi,

    When I paste my programming code into Gmail, it replaces the spaces with non-breaking spaces. When I copy and paste the code from Gmail into Notepad++, I see the non-breaking spaces as C2 A0 in hex view mode. In hex view mode I can replace C2 A0 with 20 (space).

    How can I replace the non-breaking spaces (C2 A0) without going to hex view mode first? I tried “\xC2\xA0” in Regular expression search mode, but that does not work: it says it can’t find the text “\xC2\xA0”.

    Thanks, Bart

  • mkupper

    @Bart-Heinsius said in Replace non-breaking space UTF-8 (C2 A0):

    \xC2\xA0

    Search for \x{A0} using regular expression mode. You can use the \x{xxxx} style for any 16-bit Unicode character. It does not work for characters beyond \x{ffff}, such as the newer emoji.
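
    For comparison, here is a rough Python sketch of the same character-level search. It is only an illustration: Python's `re` module does not accept the `\x{A0}` syntax, so the two-digit escape `\xa0` is used instead, and the sample string is made up.

```python
import re

# Hypothetical sample: words separated by a non-breaking space (U+00A0),
# followed by an ordinary newline (U+000A).
text = "foo\u00a0bar\nbaz"

# At the character level the non-breaking space is ONE code point, U+00A0,
# so the pattern names that single character, not the two bytes C2 A0.
cleaned = re.sub(r"\xa0", " ", text)

# The newline is untouched; only the non-breaking space was replaced.
print(repr(cleaned))
```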

  • Alan Kilborn @Bart Heinsius

    @Bart-Heinsius

    Hex editing is a “low-level” operation (encoding-unaware). Regular expression operations function at a higher level (encoding-aware). Note: an encoding maps particular sequences of low-level bytes to higher-level characters. (That’s the short version.)

    The best way to perform the operation is to get the data you want to replace into the Find what box. You can do this by selecting the character and then pressing Ctrl+F. Replace it with anything you want.
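
    The byte-level versus character-level distinction can be sketched in Python. This is only an illustration outside Notepad++, assuming the text is UTF-8; the sample string is made up.

```python
# A non-breaking space as a decoded character (what search-and-replace sees).
nbsp = "\u00a0"

# The same character as UTF-8 bytes (what a hex view shows).
raw = nbsp.encode("utf-8")
print(raw.hex(" ").upper())   # C2 A0
print(len(nbsp), len(raw))    # 1 2

# Hypothetical sample line containing one non-breaking space.
data = "int\u00a0x;".encode("utf-8")

# Byte-level replacement has to name both bytes...
fixed_bytes = data.replace(b"\xc2\xa0", b"\x20")

# ...while character-level replacement names the single code point.
fixed_text = data.decode("utf-8").replace("\u00a0", " ")

# Both routes end up with the same text.
print(fixed_bytes.decode("utf-8") == fixed_text)
```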

  • Bart Heinsius @mkupper

    @mkupper Searching for \x{A0} finds all my newlines too. I just want to find the non-breaking spaces (and replace them all with normal spaces).

    Searching for \x{C2A0} does not find the non-breaking spaces.

  • Bart Heinsius @Alan Kilborn

    @Alan-Kilborn I selected a non-breaking space in the document, copied it into the Find and replace dialog, to replace it with a normal space, but the non-breaking spaces are still in there.

  • mkupper @Bart Heinsius

    @Bart-Heinsius - What language or file extension are you using on the Notepad++ side? I tested using plain .txt, but you mentioned “programming code” being copied to/from Gmail.

    What encoding are you using? Click the Encoding menu option to see what’s currently selected. I would expect UTF-8 or UTF-8 BOM.

    Can you show the full expression you are using to search? There is a “. matches newline” option in the search/replace box, but \x{A0} by itself should not match newlines, and it does not for me. Likewise, copying one of the non-breaking spaces you know of into the search box, as suggested, should not match newlines.

  • Alan Kilborn @Bart Heinsius

    @Bart-Heinsius said in Replace non-breaking space UTF-8 (C2 A0):

    I selected a non-breaking space in the document, copied it into the Find and replace dialog, to replace it with a normal space, but the non-breaking spaces are still in there.

    Worked for me to replace non-breaking spaces. Don’t know what’s truly going on with your situation.

  • PeterJones

    @Bart-Heinsius ,

    To help us understand what’s going on for you, could you share a screenshot similar to the following:

    I created a file with non-breaking spaces between the words (so there were two of them), and did a Count, which correctly showed the two. My screenshot also includes the current encoding in the status bar: [screenshot]

    (edit: in case I wasn’t clear, we want the screenshot of your real file, not of a brand new file created to look like mine)

    Another thing you could do: take a small snippet from your file that contains the non-breaking space, select it, use Plugins > MIME Tools > Base64 Encode, and paste the result here, using the forum’s </> button to make sure it comes through as real text:

    bm9uwqBicmVha2luZ8Kgc3BhY2U

    With that, we can Base64 Decode and have the same bytes as you for the experiment.
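
    As a sketch of what that decode recovers, here is the same step in Python rather than in MIME Tools. The string as pasted above has no trailing "=", so this example pads it first; the padding step is an assumption about how to handle that, not something the plugin is known to require.

```python
import base64

# The Base64 text exactly as pasted in the post above.
encoded = "bm9uwqBicmVha2luZ8Kgc3BhY2U"

# Pad to a multiple of 4 characters, since base64.b64decode rejects
# input whose padding is incorrect.
padded = encoded + "=" * (-len(encoded) % 4)

# Decoding yields raw bytes: each non-breaking space shows up as C2 A0.
raw = base64.b64decode(padded)
print(raw)

# Interpreting those bytes as UTF-8 yields characters: each C2 A0 pair
# becomes the single character U+00A0.
text = raw.decode("utf-8")
print(text.count("\u00a0"))   # 2
```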

    Anything you can do to help us replicate your circumstances will help us help you debug the problem.

    (Also, open the ? menu, choose Debug Info, and paste the result in your reply.)

  • Bart Heinsius @PeterJones

    @PeterJones I’m sorry, I was confusing A0 with newlines (0A). I thought I needed to replace the C2A0 that I see in hex view. But replacing just A0 with a space also gets rid of the C2. Thanks for your time and support.

  • PeterJones @Bart Heinsius

    @Bart-Heinsius ,

    Glad it’s working for you now.

    In case you are curious, 0xC2 0xA0 is the two byte sequence in UTF-8 that represents the single character at U+00A0.

    • https://en.wikipedia.org/wiki/UTF-8
    • http://www.fileformat.info/info/unicode/char/00a0/index.htm
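
    The arithmetic behind that two-byte sequence can be worked through in a few lines of Python. This is a sketch of the standard UTF-8 two-byte layout (110xxxxx 10xxxxxx, used for U+0080 through U+07FF); the variable names are just for illustration.

```python
import unicodedata

code_point = 0x00A0  # NO-BREAK SPACE

# Two-byte UTF-8 layout: 110xxxxx 10xxxxxx
lead = 0b11000000 | (code_point >> 6)          # upper 5 bits -> 0xC2
cont = 0b10000000 | (code_point & 0b00111111)  # lower 6 bits -> 0xA0

encoded = bytes([lead, cont])
print(encoded.hex(" ").upper())   # C2 A0

# Python's own encoder agrees with the hand-built bytes.
print(encoded == chr(code_point).encode("utf-8"))

# By contrast, \x{C2A0} names the code point U+C2A0, which is an
# unrelated character, so that search cannot match a non-breaking space.
other_name = unicodedata.name(chr(0xC2A0))
print(other_name)
```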

    As @Alan-Kilborn said, the search-and-replace feature works with the actual character codepoints, hence it uses \x{A0} or \x{00A0} to match that character, whereas the Hex Editor works with the individual bytes, so it shows you both the byte 0xC2 and the byte 0xA0.

    Like the hex editor, the MIME Tools plugin uses the raw bytes rather than the characters. You can easily see this in my example: running MIME Tools > URL Encode on it gives non%C2%A0breaking%C2%A0space, where %C2 and %A0 encode the bytes 0xC2 0xA0, which in turn encode the character at U+00A0.
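
    The same percent-encoded result can be reproduced outside Notepad++. A minimal Python sketch, using the standard library's `urllib.parse.quote` (which encodes as UTF-8 by default) as a stand-in for the plugin:

```python
from urllib.parse import quote, unquote

# Two non-breaking spaces (U+00A0) between the words, as in the example.
text = "non\u00a0breaking\u00a0space"

# Percent-encoding works on bytes, so each U+00A0 becomes %C2%A0.
encoded = quote(text)
print(encoded)   # non%C2%A0breaking%C2%A0space

# Decoding reverses both steps: bytes first, then UTF-8 characters.
decoded = unquote(encoded)
print(decoded == text)
```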
