I've seen references to using String.replaceAll("", "") as a means of eliminating "blank" or "nonprinting" characters from a string in Java. It does no such thing, as the answer will demonstrate.

value = value.replaceAll("", "");


Solution

Using JUnit, I tested every single char value (the UTF-16 code units from \u0000 to \uffff).

@Test
public void testReplaceBlanks() {
    // Use an int counter: a char can never exceed 65535, so a char loop
    // bounded by i <= 65535 would only end through an explicit break.
    for (int i = 0; i <= 0xFFFF; ++i) {
        char input = (char) i;
        System.out.print(input);
        System.out.print(" ");
        if (i % 80 == 0) {
            System.out.println();
        }

        String test = Character.toString(input);
        // replaceAll("", "") must hand the string back unchanged.
        assertTrue(test.equals(test.replaceAll("", "")));
    }
}

I don't find a single instance where that line of code does anything useful.
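
The reason is visible if the replacement is made non-empty. The empty pattern matches the zero-width gap before each character and at the end of the string, so replacing those gaps with nothing leaves every character in place. A minimal sketch (the class and method names are made up for illustration):

```java
public class EmptyRegexDemo {
    // Replaces every match of the empty pattern with the given marker.
    static String markEmptyMatches(String s, String marker) {
        return s.replaceAll("", marker);
    }

    public static void main(String[] args) {
        // The empty pattern matches before each character and at the end.
        System.out.println(markEmptyMatches("abc", "-")); // prints -a-b-c-

        // With an empty replacement nothing is inserted and nothing is
        // removed, so the trailing NUL character survives.
        String withNul = "foo" + '\0';
        System.out.println(markEmptyMatches(withNul, "").equals(withNul)); // prints true
    }
}
```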

Since I've spotted this idiom a couple more times on the internet, here's a more robust test case. The major problem is that this line of code is a no-op:

value = value.replaceAll("", "");

Consider the following test case:

  public static void println(String s) {
    System.out.println(s);
  }

  @Test
  public void testNullStripWithEmptyString() {
    String input = "foo" + '\0';
    String input2 = "foo";
    println(input);
    println("input:");
    printBytes(input.getBytes());
    println("input2:");
    printBytes(input2.getBytes());
    String testValue = input.replaceAll("", "");
    println("testValue:");
    printBytes(testValue.getBytes());
    String testvalue2 = input2.replaceAll("","");
    println("testvalue2");
    printBytes(testvalue2.getBytes());
    assertFalse(input.equals(input2));
    assertFalse(testValue.equals(testvalue2));
  }

This test case first demonstrates, through the byte representations of the two input strings, that the null byte appears in the first but not in the second. We then call replaceAll("", "") on each input and store the results in two new variables, testValue and testvalue2.

The first assert checks that the two inputs are not equal according to the normal String equals method. This is trivially true, because the first input has a nonprinting null character appended to it. The nail in the coffin is the second assert, which shows that the inequality still holds after calling replaceAll("", "") on both strings: the null character was not removed.
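
Both tests call a printBytes helper that the post does not show. The following is only a guess at what it might look like, assuming it prints each byte as two hex digits; the names toHex and PrintBytesSketch are hypothetical, and UTF-8 is chosen explicitly here whereas the tests use the platform default charset.

```java
import java.nio.charset.StandardCharsets;

public class PrintBytesSketch {
    // Formats each byte as two lowercase hex digits separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Hypothetical stand-in for the printBytes helper used by the tests.
    static void printBytes(byte[] bytes) {
        System.out.println(toHex(bytes));
    }

    public static void main(String[] args) {
        printBytes(("foo" + '\0').getBytes(StandardCharsets.UTF_8)); // prints 66 6f 6f 00
        printBytes("foo".getBytes(StandardCharsets.UTF_8));          // prints 66 6f 6f
    }
}
```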

The only way to strip the null character is to name it explicitly in the pattern, as the following test case does:

  @Test 
  public void testNullStripWithNullUnicodeEscape(){
    String input = "foo" + '\0';
    String input2 = "foo";
    println(input);
    println("input:");
    printBytes(input.getBytes());
    println("input2:");
    printBytes(input2.getBytes());
    String testValue = input.replaceAll("\u0000", "");
    println("testValue:");
    printBytes(testValue.getBytes());
    String testvalue2 = input2.replaceAll("\u0000","");
    println("testvalue2");
    printBytes(testvalue2.getBytes());
    assertFalse(input.equals(input2));
    assertTrue(testValue.equals(testvalue2));
  }
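
If the goal is to drop more than just the null character, one option is the POSIX character class \p{Cntrl} from java.util.regex.Pattern, which covers U+0000 through U+001F plus U+007F. This is a sketch rather than a recommendation: the class and method names are invented here, and note that it also removes tabs and newlines, which may not be what you want.

```java
public class StripControlChars {
    // Removes ASCII control characters (U+0000 to U+001F, and U+007F).
    static String stripControl(String s) {
        return s.replaceAll("\\p{Cntrl}", "");
    }

    public static void main(String[] args) {
        // A NUL in the middle and a BEL (code point 7) at the end.
        String input = "foo" + '\0' + "bar" + (char) 7;
        System.out.println(stripControl(input)); // prints foobar
    }
}
```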

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow