How do I remove ï»¿ from the beginning of a file?

https://stackoverflow.com/questions/3255993

16-09-2020

Question

I have a CSS file that looks fine when I open it in gedit, but when it is read by PHP (to merge all the CSS files into one), the following characters are prepended to the CSS: ï»¿

PHP removes all whitespace, so a random ï»¿ in the middle of the code messes up the whole thing. As mentioned, I can't actually see these characters when I open the file in gedit, so I can't remove them easily.

I googled the problem, and something is clearly wrong with the file encoding, which makes sense because I have been moving the files between different Linux/Windows servers via FTP and rsync, using a range of text editors. I don't really know much about character encoding, though, so help would be appreciated.

If it helps, the file is saved in UTF-8 format, and gedit won't let me save it in ISO-8859-15 format (the document contains one or more characters that cannot be encoded using the specified character encoding). I tried saving it with Windows and Linux line endings, but neither helped.

Solution

Three words for you:

Byte Order Mark (BOM)

That is the representation of the UTF-8 BOM in ISO-8859-1. You have to tell your editor not to use BOMs, or use a different editor to strip them out.
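To make the connection concrete, here is a minimal Python sketch (Python is used only for illustration, and the variable names are made up) that decodes the three UTF-8 BOM bytes once as ISO-8859-1 and once as UTF-8:

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF.
bom_bytes = codecs.BOM_UTF8

# Decoded as ISO-8859-1 (Latin-1), every byte becomes its own visible character.
as_latin1 = bom_bytes.decode("iso-8859-1")

# Decoded as UTF-8, the same three bytes form the single code point U+FEFF.
as_utf8 = bom_bytes.decode("utf-8")

print(repr(as_latin1))
print(len(as_utf8))
```

The first value is the three-character string seen in the merged CSS; the second is a single, normally invisible character, which would explain why an editor that understands UTF-8 shows nothing.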

To automate the BOM's removal you can use awk, as shown in this question.

As another answer says, the best option would be for PHP to actually interpret the BOM correctly. For that you can use mb_internal_encoding(), like this:

 <?php
   //Storing the previous encoding in case you have some other piece 
   //of code sensitive to encoding and counting on the default value.      
   $previous_encoding = mb_internal_encoding();

   //Set the encoding to UTF-8, so when reading files it ignores the BOM       
   mb_internal_encoding('UTF-8');

   //Process the CSS files...

   //Finally, return to the previous encoding
   mb_internal_encoding($previous_encoding);

   //Rest of the code...
  ?>

Other tips

Open your file in Notepad++. From the Encoding menu, select Convert to UTF-8 without BOM, save the file, and replace the old file with this new one. And it will work, damn sure.

In PHP, you can do the following to remove all non-characters, including the character in question.

[code block missing from source]

For those with shell access, here is a little command to find all files with the BOM set in the public_html directory. Be sure to change it to the correct path on your server.

Code:

[code block missing from source]

And if you are comfortable with the vi editor, open the file in vi:

[code block missing from source]

And enter the command to remove the BOM:

[code block missing from source]

Save the file:

[code block missing from source]
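The idea of scanning a directory tree for files that start with the BOM can be sketched in Python as follows. This is an illustrative stand-in, not the command this answer originally showed; the function name find_bom_files is hypothetical.

```python
import os

UTF8_BOM = b"\xef\xbb\xbf"

def find_bom_files(root):
    """Return the sorted paths of files under root that start with the UTF-8 BOM."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Only the first three bytes are needed for the check.
                with open(path, "rb") as fh:
                    if fh.read(3) == UTF8_BOM:
                        hits.append(path)
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
    return sorted(hits)
```

Calling find_bom_files("public_html") would list the affected files, assuming that directory exists relative to where the script runs.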

The BOM is just a sequence of bytes ($EF $BB $BF for UTF-8), so just remove it using a script, or configure the editor so that it is not added.
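A script of that kind could look like the following Python sketch. The helper names strip_bom and strip_bom_from_file are made up for this example, and the file is rewritten in place, so try it on a copy first.

```python
UTF8_BOM = b"\xef\xbb\xbf"

def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; other input is returned unchanged."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data

def strip_bom_from_file(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file at path. Return True if one was removed."""
    with open(path, "rb") as fh:
        data = fh.read()
    if not data.startswith(UTF8_BOM):
        return False
    # Rewrite the file with everything after the three BOM bytes.
    with open(path, "wb") as fh:
        fh.write(strip_bom(data))
    return True
```

Working on raw bytes avoids any decoding step, so the rest of the file is written back exactly as it was.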

From Removing BOM from UTF-8:

[code block missing from source]

I'm sure it translates to PHP easily.

For me, this worked:

[code block missing from source]

If I remove this meta, the ï»¿ appears again. Hope this helps someone...

I don't know PHP, so I don't know if this is possible, but the best solution would be to read the file as UTF-8 rather than in some other encoding. The BOM is actually a ZERO WIDTH NO-BREAK SPACE. This is whitespace, so if the file were read in the correct encoding (UTF-8), the BOM would be interpreted as whitespace and ignored in the resulting CSS file.

Another advantage of reading the file in the correct encoding is that you don't have to worry about characters being misinterpreted. Your editor is telling you that the code page you want to save the file in cannot represent all the characters you need. If PHP then reads the file in the wrong encoding, it is very likely that characters other than the BOM are silently being misinterpreted as well. Use UTF-8 everywhere, and these problems disappear.
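As a rough illustration of reading the files in the correct encoding, here is a Python sketch of the merge step. It assumes Python's "utf-8-sig" codec as a stand-in for whatever PHP would do: that codec decodes UTF-8 and drops a BOM at the start of the file. The function name merge_css is hypothetical.

```python
def merge_css(paths):
    """Concatenate CSS files, decoding each as UTF-8 and dropping a leading BOM."""
    parts = []
    for path in paths:
        # "utf-8-sig" behaves like "utf-8" but skips a BOM at the start of the file.
        # newline="" keeps the original line endings untouched.
        with open(path, encoding="utf-8-sig", newline="") as fh:
            parts.append(fh.read())
    return "\n".join(parts)
```

Because every file is decoded as UTF-8, non-ASCII characters elsewhere in the stylesheets come through intact as well.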

You can use

[code block missing from source]

Replacing with awk seems to work, but it is not in place.

grep -rl $'\xEF\xBB\xBF' * | xargs vim -e -c 'argdo set fileencoding=utf-8|set encoding=utf-8| set nobomb| wq'

I had the same problem with the BOM appearing in some of my PHP files (ï»¿ï»¿).

If you use PhpStorm you can set a hotkey to remove it in Settings -> IDE Settings -> Keymap -> Main Menu -> File -> Remove BOM.

In Notepad++, choose the "Encoding" menu, then "Encode in UTF-8 without BOM". Then save.

See Stack Overflow question How to make Notepad to save text in UTF-8 without BOM?.

Open the PHP file under question, in Notepad++.

Click on Encoding at the top and change from "Encoding in UTF-8 without BOM" to just "Encoding in UTF-8". Save and overwrite the file on your server.

Same problem, different solution.

One line in the PHP file was printing out XML headers (which use the same begin/end tags as PHP). Looks like the code within these tags set the encoding, and was executed within PHP which resulted in the strange characters. Either way here's the solution:

# Original
$xml_string = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

# fixed
$xml_string = "<" . "?xml version=\"1.0\" encoding=\"UTF-8\"?" . ">";

If you need to be able to remove the BOM from UTF-8 encoded files, you first need to get hold of an editor that is aware of them.

I personally use E Text Editor.

In the bottom right, there are options for character encoding, including the BOM tag. Load your file, deselect Byte Order Marker if it is selected, resave, and it should be done.

Screenshot: http://oth4.com/encoding.png

E is not free, but there is a free trial, and it is an excellent editor (limited TextMate compatibility).

You can open it in PhpStorm, right-click the file, and click Remove BOM...

Here is another good solution for the problem with BOM. These are two VBScript (.vbs) scripts.

One for finding the BOM in a file and one for KILLING the damned BOM in the file. It works pretty fine and is easy to use.

Just create a .vbs file, and paste the following code in it.

You can use the VBScript script simply by dragging and dropping the suspicious file onto the .vbs file. It will tell you if there is a BOM or not.

' Heiko Jendreck - personal helpdesk & webdesign
' http://www.phw-jendreck.de
' 2010.05.10 Vers 1.0
'
' find_BOM.vbs
' ====================
' Small utility that looks for the BOM
'
 Const UTF8_BOM = "ï»¿"
 Const UTF16BE_BOM = "þÿ"
 Const UTF16LE_BOM = "ÿþ"
 Const ForReading = 1
 Const ForWriting = 2
 Dim fso
 Set fso = WScript.CreateObject("Scripting.FileSystemObject")
 Dim f
 f = WScript.Arguments.Item(0)
 Dim t
 t = fso.OpenTextFile(f, ForReading).ReadAll
 If Left(t, 3) = UTF8_BOM Then
     MsgBox "UTF-8-BOM detected!"
 ElseIf Left(t, 2) = UTF16BE_BOM Then
     MsgBox "UTF-16-BOM (Big Endian) detected!"
 ElseIf Left(t, 2) = UTF16LE_BOM Then
     MsgBox "UTF-16-BOM (Little Endian) detected!"
 Else
     MsgBox "No BOM detected!"
 End If

If it tells you there is a BOM, create the second .vbs file with the following code and drag the suspicious file onto it.

' Heiko Jendreck - personal helpdesk & webdesign
' http://www.phw-jendreck.de
' 2010.05.10 Vers 1.0
'
' kill_BOM.vbs
' ====================
' Small utility that deletes the BOM once it has been found
'
Const UTF8_BOM = "ï»¿"
Const ForReading = 1
Const ForWriting = 2
Dim fso
Set fso = WScript.CreateObject("Scripting.FileSystemObject")
Dim f
f = WScript.Arguments.Item(0)
Dim t
t = fso.OpenTextFile(f, ForReading).ReadAll
If Left(t, 3) = UTF8_BOM Then
    fso.OpenTextFile(f, ForWriting).Write (Mid(t, 4))
    MsgBox "BOM deleted!"
Else
    MsgBox "No UTF-8 BOM present!"
End If

The code is from Heiko Jendreck.
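For readers not on Windows, the detection step of the first script can be sketched in Python. This is a loose counterpart rather than a port: like the script it only knows the UTF-8 and UTF-16 marks, the function name detect_bom is made up, and it works on raw bytes instead of decoded text.

```python
import codecs

# The same three byte order marks the VBScript checks for.
KNOWN_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
]

def detect_bom(data: bytes):
    """Return the name of the BOM at the start of data, or None if there is none."""
    for bom, name in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None
```

To check a file, pass its first few bytes, for example detect_bom(open(path, "rb").read(4)), where path is whatever file you suspect.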

In PhpStorm, for multiple files and a BOM that is not necessarily at the beginning of the file, you can search for \x{FEFF} (Regular Expression) and replace it with nothing.
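The same search-and-replace can be sketched outside the IDE. This hypothetical Python helper assumes the text has already been decoded as UTF-8, so each BOM appears as the single character U+FEFF:

```python
def remove_all_boms(text: str) -> str:
    """Remove every U+FEFF character from text, wherever it occurs."""
    return text.replace("\ufeff", "")
```

This covers the case where several files were concatenated and a BOM ended up in the middle of the result.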

Same problem, but it only affected one file so I just created a blank file, copy/pasted the code from the original file to the new file, and then replaced the original file. Not fancy but it worked.

Use Total Commander to search for all BOMed files:

Elegant way to search for UTF-8 files with BOM?

Open these files in some proper editor (that recognizes BOM) like Eclipse.
Change the file's encoding to ISO (right click, properties).
Cut ï»¿ from the beginning of the file, save
Change the file's encoding back to UTF-8

...and do not even think about using n...d again!

I had the same problem. The cause was that one of my PHP files was in UTF-8 (the most important one, the configuration file that is included in all PHP files).

In my case, two different solutions worked for me:

First, I changed the Apache configuration by using the AddDefaultCharset directive in the configuration files (or in .htaccess). This solution forces Apache to use the correct encoding.

AddDefaultCharset ISO-8859-1

The second solution was to change the bad encoding of the php file.

Copy the text of your filename.css file.
Close your css file.
Rename it filename2.css to avoid a filename clash.
In MS Notepad or Wordpad, create a new file.
Paste the text into it.
Save it as filename.css, selecting UTF-8 from the encoding options.
Upload filename.css.

Check on your index.php, find "... charset=iso-8859-1" and replace it with "... charset=utf-8".

Maybe it'll work.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow