C++ file IO and splitting by separator

https://stackoverflow.com/questions/267427

06-07-2019

Question

I have a file with data listed as follows:

0,       2,    10
10,       8,    10
10,       10,   10
10,       16,   10
15,       10,   16
17,       10,   16

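I want to read the file and split it into three arrays, trimming all excess whitespace and converting each element to an integer along the way.

For some reason I can't find an easy way to do this in C++. The only success I've had is reading each line into an array, stripping out all the spaces, and then splitting it up. The whole process took a good 20-30 lines of code, and it is a pain to modify for a different separator (a space, for example).

This is the Python equivalent of what I would like to have in C++: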

f = open('input_hard.dat')
lines =  f.readlines()
f.close()

#declarations
inint, inbase, outbase = [], [], []

#input parsing
for line in lines:
    bits = line.split(',')
    inint.append(int(bits[0].strip()))
    inbase.append(int(bits[1].strip()))
    outbase.append(int(bits[2].strip()))

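Being able to do this easily in Python is one of the reasons I moved to it in the first place. However, I now need to do this in C++, and I would hate to have to use my ugly 20-30 line code.

Any help would be appreciated, thanks!

Solution

There is really nothing wrong with fscanf, which is probably the fastest solution in this case. And it is as short and readable as the Python code: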

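#include <cstdio>
#include <vector>

int main() {
  std::vector<int> vx, vy, vz;
  FILE *fp = std::fopen("file.dat", "r");
  if (!fp) return 1;  // the file could not be opened

  int x, y, z;
  // %d skips leading whitespace, so the padding after each comma is consumed automatically.
  while (std::fscanf(fp, "%d, %d, %d", &x, &y, &z) == 3) {
    vx.push_back(x);
    vy.push_back(y);
    vz.push_back(z);
  }
  std::fclose(fp);
}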

Other tips

There is really no need to use Boost in this example, as streams will do the trick nicely:

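#include <fstream>
#include <vector>
using namespace std;

int main(int argc, char* argv[])
{
    if (argc < 2) return 1;  // expects the input file name as the first argument
    ifstream file(argv[1]);

    const unsigned maxIgnore = 10;
    const int delim = ',';
    int x,y,z;

    vector<int> vecx, vecy, vecz;

    // Store a row only after all three reads succeed; testing the stream
    // before reading would push one stale row at end of file.
    while (file >> x
           && file.ignore(maxIgnore, delim) >> y
           && file.ignore(maxIgnore, delim) >> z)
    {
        vecx.push_back(x);
        vecy.push_back(y);
        vecz.push_back(z);
    }
}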
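Since the question also complains about how hard it is to switch to another separator, here is a minimal sketch of the same stream loop wrapped in a function that takes the delimiter as a parameter. The names `Columns` and `readColumns` are made up for illustration, and the sketch assumes three integers per row with the delimiter no more than 10 characters away (the same `maxIgnore` limit as above).

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical container for the three columns.
struct Columns {
    std::vector<int> x, y, z;
};

// Hypothetical helper: reads rows of three integers separated by `delim`.
// Works on any std::istream, so files and string streams are handled alike.
Columns readColumns(std::istream& in, char delim) {
    const int maxIgnore = 10;  // assumed upper bound on padding before the delimiter
    Columns cols;
    int x, y, z;
    // A row is stored only if all three extractions succeed.
    while (in >> x
           && in.ignore(maxIgnore, delim) >> y
           && in.ignore(maxIgnore, delim) >> z) {
        cols.x.push_back(x);
        cols.y.push_back(y);
        cols.z.push_back(z);
    }
    return cols;
}
```

Calling `readColumns(file, ',')` on an open `std::ifstream` would handle the data in the question, and `readColumns(file, ' ')` would handle a space-separated variant. Note that `ignore` discards whatever sits between a number and the delimiter, so malformed rows are not detected.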

Though if I were going to use Boost, I would prefer the simplicity of the tokenizer over regex... :)

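Something like:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <vector>

int main() {
   std::vector<int> inint, inbase, outbase;
   char buf[256];
   FILE *fh = fopen("input_hard.dat", "r");
   while (fgets(buf, sizeof buf, fh)) {
      char *tok = strtok(buf, ", ");   // split on commas and spaces
      inint.push_back(atoi(tok));
      tok = strtok(NULL, ", ");
      inbase.push_back(atoi(tok));
      tok = strtok(NULL, ", ");
      outbase.push_back(atoi(tok));
   }
   fclose(fh);
}

Sans error checking.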

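std::getline lets you read a line of text, and you can use a string stream to parse the individual line:

#include <iostream>
#include <sstream>
#include <string>
using namespace std;

string buf;
getline(cin, buf);
stringstream par(buf);

char buf2[512];
par.getline(buf2, 512, ','); /* Reads the first token, up to the first comma. */

Once you have the line of text in a string, you can use any parsing function you like on it, even sscanf(buf.c_str(), "%d,%d,%d", &i1, &i2, &i3), or atoi on the substring that holds each integer, or some other method.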
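To make the getline-then-sscanf idea concrete, here is a minimal sketch. `parseLine` and `readAll` are hypothetical names, and the sketch assumes that a line which does not hold three comma-separated integers should simply be skipped.

```cpp
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses one "a, b, c" line.
// Returns false unless exactly three integers were converted.
bool parseLine(const std::string& line, int& i1, int& i2, int& i3) {
    // The space before each comma also tolerates padding in front of it;
    // %d already skips whitespace in front of each number.
    return std::sscanf(line.c_str(), "%d ,%d ,%d", &i1, &i2, &i3) == 3;
}

// Hypothetical helper: reads the stream line by line and appends each
// well-formed row to the three vectors (malformed lines are skipped).
void readAll(std::istream& in, std::vector<int>& a,
             std::vector<int>& b, std::vector<int>& c) {
    std::string line;
    int i1, i2, i3;
    while (std::getline(in, line)) {
        if (parseLine(line, i1, i2, i3)) {
            a.push_back(i1);
            b.push_back(i2);
            c.push_back(i3);
        }
    }
}
```

Because the parsing happens one line at a time, a bad line cannot throw the following lines out of step, which is the main advantage over scanning the whole file as one stream.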

You can also ignore unwanted characters in the input stream:

if (cin.peek() == ',')
    cin.ignore(1, ',');
cin >> nextInt;
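Put into a loop, the peek/ignore idea might look like the sketch below. `readInts` is a hypothetical name, and the extra `std::ws` step (not in the snippet above) is an assumption added so that spaces in front of a comma do not hide it from `peek()`.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: collects every integer in the stream,
// skipping a single comma after each number if one is present.
std::vector<int> readInts(std::istream& in) {
    std::vector<int> values;
    int nextInt;
    while (in >> nextInt) {
        values.push_back(nextInt);
        in >> std::ws;            // skip whitespace so peek() sees the next real character
        if (in.peek() == ',')
            in.ignore(1, ',');    // drop the separator
    }
    return values;
}
```

This returns one flat list; to get the three columns from the question you would distribute the values by position (index modulo 3).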

If you don't mind using the Boost libraries...

#include <istream>
#include <string>
#include <vector>
#include <boost/lexical_cast.hpp>
#include <boost/regex.hpp>

std::vector<int> ParseFile(std::istream& in) {
    // Optional leading spaces, one run of digits (captured), optional trailing comma.
    const boost::regex cItemPattern(" *([0-9]+),?");
    std::vector<int> return_value;

    std::string line;
    while (std::getline(in, line)) {
        std::string::const_iterator b=line.begin(), e=line.end();
        boost::smatch match;
        // Consume the line one number at a time, resuming after each match.
        while (b!=e && boost::regex_search(b, e, match, cItemPattern)) {
            return_value.push_back(boost::lexical_cast<int>(match[1].str()));
            b=match[0].second;
        }
    }

    return return_value;
}

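This pulls lines from the stream, then uses the Boost::regex library (with a capture group) to extract each number from the line. It silently ignores anything that is not a valid number, though that can be changed if you wish.

It is still about twenty lines with the #includes, but you can use it to extract essentially anything from the file's lines. This is a trivial example; I am using nearly identical code to extract tags and optional values from a database field. The only major difference is the regular expression.

EDIT: Oops, you wanted three separate vectors. Try this slight modification instead: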

// Three captured integers per line, separated by commas and optional spaces.
const boost::regex cItemPattern(" *([0-9]+), *([0-9]+), *([0-9]+)");
std::vector<int> vector1, vector2, vector3;

std::string line;
while (std::getline(in, line)) {
    std::string::const_iterator b=line.begin(), e=line.end();
    boost::smatch match;
    while (b!=e && boost::regex_search(b, e, match, cItemPattern)) {
        vector1.push_back(boost::lexical_cast<int>(match[1].str()));
        vector2.push_back(boost::lexical_cast<int>(match[2].str()));
        vector3.push_back(boost::lexical_cast<int>(match[3].str()));
        b=match[0].second;
    }
}
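If pulling in Boost is not an option, roughly the same thing can be sketched with `std::regex` from C++11. This swaps `boost::regex` and `boost::lexical_cast` for `std::regex` and `std::stoi`; it is not the code from the answer above. `ParsedColumns` and `parseWithStdRegex` are hypothetical names, and the sketch assumes one row per line and non-negative values that fit in an `int` (`std::stoi` throws otherwise).

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the three columns.
struct ParsedColumns {
    std::vector<int> first, second, third;
};

// Hypothetical helper: extracts three integers from each matching line.
ParsedColumns parseWithStdRegex(std::istream& in) {
    // Same pattern as the three-vector Boost version above.
    static const std::regex rowPattern(" *([0-9]+), *([0-9]+), *([0-9]+)");
    ParsedColumns out;
    std::string line;
    while (std::getline(in, line)) {
        std::smatch match;
        // Lines that do not contain a full row are silently skipped.
        if (std::regex_search(line, match, rowPattern)) {
            out.first.push_back(std::stoi(match[1].str()));
            out.second.push_back(std::stoi(match[2].str()));
            out.third.push_back(std::stoi(match[3].str()));
        }
    }
    return out;
}
```

As with the Boost version, supporting a different separator or field layout only means changing the regular expression.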

Why not the same code as in Python :)?

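#include <fstream>
#include <vector>

std::ifstream file("input_hard.dat");
std::vector<int> inint, inbase, outbase;

int val1, val2, val3;
char delim;
// Looping on the extraction itself avoids storing a bogus row at end of file,
// which a file.good() test before the read would do.
while (file >> val1 >> delim >> val2 >> delim >> val3) {
    inint.push_back(val1);
    inbase.push_back(val2);
    outbase.push_back(val3);
}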
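One thing to be aware of: `char delim` accepts any non-whitespace character, not just a comma. If that matters, a small check can be added. The sketch below uses a hypothetical `readRow` helper and assumes that a row with the wrong separator should put the stream into a failed state.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: reads "a, b, c" and verifies both separators are commas.
// Returns false at end of input or on a malformed row.
bool readRow(std::istream& in, int& a, int& b, int& c) {
    char d1 = 0, d2 = 0;
    if (!(in >> a >> d1 >> b >> d2 >> c))
        return false;
    if (d1 != ',' || d2 != ',') {
        in.setstate(std::ios::failbit);  // flag the stream so callers stop reading
        return false;
    }
    return true;
}
```

A loop such as `while (readRow(file, val1, val2, val3)) { ... }` would then stop at the first row that uses a different separator instead of quietly accepting it.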

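If you want to be able to scale to harder input formats, you should consider Spirit, the Boost parser combinator library.

This page has an example that does almost what you need (with reals and a single vector, though).

License: CC-BY-SA with attribution

Not affiliated with StackOverflow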