C++ - How do I iterate over the words of a string?


Here's another solution. It's compact and reasonably efficient:

    std::vector<std::string> split(const std::string &text, char sep) {
        std::vector<std::string> tokens;
        std::size_t start = 0, end = 0;
        while ((end = text.find(sep, start)) != std::string::npos) {
            tokens.push_back(text.substr(start, end - start));
            start = end + 1;
        }
        tokens.push_back(text.substr(start));
        return tokens;
    }

It can easily be templated to handle string separators, wide strings, etc.

Note that splitting "" results in a single empty string, and splitting "," (i.e. sep) results in two empty strings.
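As a rough sketch of the templating idea and of the empty-string behaviour described above, the same loop can be written over std::basic_string. The name split_t and the CharT parameter are illustrative assumptions, not part of the original answer:

```cpp
#include <string>
#include <vector>

// Hypothetical generalisation of split(): CharT is deduced from the
// arguments, so the same code serves std::string and std::wstring.
template <typename CharT>
std::vector<std::basic_string<CharT>> split_t(const std::basic_string<CharT>& text,
                                              CharT sep) {
    std::vector<std::basic_string<CharT>> tokens;
    std::size_t start = 0, end = 0;
    while ((end = text.find(sep, start)) != std::basic_string<CharT>::npos) {
        tokens.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    // Whatever follows the last separator is always pushed, even if empty.
    tokens.push_back(text.substr(start));
    return tokens;
}
```

With this sketch, split_t(std::string(""), ',') holds one empty string and split_t(std::string(","), ',') holds two, matching the note above.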

It can also be easily extended to skip empty tokens:

    std::vector<std::string> split(const std::string &text, char sep) {
        std::vector<std::string> tokens;
        std::size_t start = 0, end = 0;
        while ((end = text.find(sep, start)) != std::string::npos) {
            if (end != start) {
                tokens.push_back(text.substr(start, end - start));
            }
            start = end + 1;
        }
        // Only push the tail if it is non-empty (e.g. no trailing separator).
        if (start < text.size()) {
            tokens.push_back(text.substr(start));
        }
        return tokens;
    }

If you want to split a string on multiple delimiters while skipping empty tokens, this version can be used:

    std::vector<std::string> split(const std::string& text, const std::string& delims) {
        std::vector<std::string> tokens;
        std::size_t start = text.find_first_not_of(delims), end = 0;

        while ((end = text.find_first_of(delims, start)) != std::string::npos) {
            tokens.push_back(text.substr(start, end - start));
            start = text.find_first_not_of(delims, end);
        }
        if (start != std::string::npos)
            tokens.push_back(text.substr(start));

        return tokens;
    }

I'm trying to iterate over the words of a string.

The string can be assumed to be composed of words separated by whitespace.

Note that I'm not interested in C string functions or that kind of character manipulation/access. Also, please give precedence to elegance over efficiency in your answer.

The best solution I have right now is:

    #include <iostream>
    #include <sstream>
    #include <string>

    using namespace std;

    int main() {
        string s = "Somewhere down the road";
        istringstream iss(s);

        do {
            string subs;
            iss >> subs;
            cout << "Substring: " << subs << endl;
        } while (iss);
    }

Is there a more elegant way to do this?

Here's a split function that:

- is generic
- uses standard C++ (no Boost)
- accepts multiple delimiters
- ignores empty tokens (can easily be changed)

    template<typename T>
    std::vector<T> split(const T & str, const T & delimiters) {
        std::vector<T> v;
        typename T::size_type start = 0;
        auto pos = str.find_first_of(delimiters, start);
        while (pos != T::npos) {
            if (pos != start) // ignore empty tokens
                v.emplace_back(str, start, pos - start);
            start = pos + 1;
            pos = str.find_first_of(delimiters, start);
        }
        if (start < str.length()) // ignore trailing delimiter
            v.emplace_back(str, start, str.length() - start); // add what's left of the string
        return v;
    }

Example usage:

    std::vector<std::string> v = split<std::string>("Hello, there; World", ";,");
    std::vector<std::wstring> w = split<std::wstring>(L"Hello, there; World", L";,");

Here's a regex solution that only uses the standard regex library. (I'm a little rusty, so there may be a few syntax errors, but this is at least the general idea.)

    #include <regex>
    #include <string>
    #include <vector>

    std::vector<std::string> split(const std::string& s) {
        std::regex r("\\w+"); // matches whole words (greedy, so no word fragments)
        std::sregex_iterator rit(s.begin(), s.end(), r);
        std::sregex_iterator rend; // end-of-sequence iterator

        // iterate through the matches to fill the vector
        std::vector<std::string> result;
        for (; rit != rend; ++rit)
            result.push_back(rit->str());
        return result;
    }

Here is a simple solution that uses only the standard regex library:

    #include <regex>
    #include <string>
    #include <vector>

    std::vector<std::string> Tokenize( const std::string str, const std::regex regex ) {
        using namespace std;

        std::vector<string> result;

        sregex_token_iterator it( str.begin(), str.end(), regex, -1 );
        sregex_token_iterator reg_end;

        for ( ; it != reg_end; ++it ) {
            if ( !it->str().empty() ) // token could be empty: check
                result.emplace_back( it->str() );
        }

        return result;
    }

The regex argument allows checking for multiple delimiters (spaces, commas, etc.).

I usually only need to split on spaces and commas, so I also have this default function:

    std::vector<std::string> TokenizeDefault( const std::string str ) {
        using namespace std;

        regex re( "[\\s,]+" );

        return Tokenize( str, re );
    }

The "[\\s,]+" checks for spaces ( \\s ) and commas ( , ).

Note, if you want to split wstring instead of string:

- change all std::regex to std::wregex
- change all sregex_token_iterator to wsregex_token_iterator

Note that you might also want to take the string argument by reference, depending on your compiler.
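The two substitutions listed above, together with passing by const reference, might look roughly like this. The name TokenizeW is a made-up label for this sketch:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical wide-character counterpart of Tokenize(). Both arguments are
// taken by const reference, so neither the string nor the regex is copied.
std::vector<std::wstring> TokenizeW(const std::wstring& str, const std::wregex& re) {
    std::vector<std::wstring> result;
    // -1 selects the text between matches, i.e. the tokens themselves.
    std::wsregex_token_iterator it(str.begin(), str.end(), re, -1);
    std::wsregex_token_iterator end;
    for (; it != end; ++it) {
        if (!it->str().empty()) // skip the empty token a leading delimiter produces
            result.emplace_back(it->str());
    }
    return result;
}
```

A call such as TokenizeW(L"a, b  c", std::wregex(L"[\\s,]+")) would then yield the three tokens a, b and c.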

The STL does not have such a method available already.

However, you can either use C's strtok() function via the std::string::c_str() member, or you can write your own. Here is a code sample I found after a quick Google search ("STL string split"):

    void Tokenize(const std::string& str,
                  std::vector<std::string>& tokens,
                  const std::string& delimiters = " ") {
        // Skip delimiters at beginning.
        std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
        // Find first delimiter after the token.
        std::string::size_type pos = str.find_first_of(delimiters, lastPos);

        while (std::string::npos != pos || std::string::npos != lastPos) {
            // Found a token, add it to the vector.
            tokens.push_back(str.substr(lastPos, pos - lastPos));
            // Skip delimiters. Note the "not_of".
            lastPos = str.find_first_not_of(delimiters, pos);
            // Find the next delimiter.
            pos = str.find_first_of(delimiters, lastPos);
        }
    }

Taken from: http://oopweb.com/CPP/Documents/CPPHOWTO/Volume/C++Programming-HOWTO-7.html

If you have questions about the code sample, leave a comment and I will explain.

And just because it does not implement a typedef called iterator or overload the << operator doesn't mean it is bad code. I use C functions quite frequently. For example, printf and scanf are both (significantly) faster than std::cin and std::cout, the fopen syntax is a lot friendlier for binary types, and they also tend to produce smaller EXEs.

Don't get sold on this "elegance over performance" deal.

This is my favorite way to iterate through a string. You can do whatever you want per word.

    string line = "a line of text to iterate through";
    string word;

    istringstream iss(line, istringstream::in);

    while( iss >> word ) {
        // Do something on `word` here...
    }

This is similar to the Stack Overflow question How do I tokenize a string in C++?.

    #include <iostream>
    #include <string>
    #include <boost/tokenizer.hpp>

    using namespace std;
    using namespace boost;

    int main(int argc, char** argv) {
        string text = "token  test\tstring";

        char_separator<char> sep(" \t");
        tokenizer<char_separator<char>> tokens(text, sep);
        for (const string& t : tokens) {
            cout << t << "." << endl;
        }
    }

Until now I used the one in Boost, but I needed something that doesn't depend on it, so I came up with this:

    static void Split(std::vector<std::string>& lst, const std::string& input,
                      const std::string& separators, bool remove_empty = true) {
        std::ostringstream word;
        for (size_t n = 0; n < input.size(); ++n) {
            if (std::string::npos == separators.find(input[n]))
                word << input[n];
            else {
                if (!word.str().empty() || !remove_empty)
                    lst.push_back(word.str());
                word.str("");
            }
        }
        if (!word.str().empty() || !remove_empty)
            lst.push_back(word.str());
    }

A good point is that you can pass more than one character in separators.

There is a function named strtok.

    #include <string.h>  // strtok_r (POSIX)
    #include <string>
    #include <vector>
    using namespace std;

    vector<string> split(char* str, const char* delim) {
        char* saveptr;
        char* token = strtok_r(str, delim, &saveptr);

        vector<string> result;

        while (token != NULL) {
            result.push_back(token);
            token = strtok_r(NULL, delim, &saveptr);
        }
        return result;
    }
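Since strtok-style functions overwrite delimiters in their input, they cannot be pointed at a const std::string directly. A minimal sketch of one workaround, assuming a private copy is acceptable, follows. The name split_copy is hypothetical, and it uses std::strtok rather than strtok_r, so it is not thread-safe:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Sketch: std::strtok writes '\0' bytes into its input, so tokenise a
// private, null-terminated copy instead of the caller's string.
// Not thread-safe: std::strtok keeps hidden state between calls.
std::vector<std::string> split_copy(const std::string& input, const char* delim) {
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> result;
    for (char* token = std::strtok(buffer.data(), delim); token != nullptr;
         token = std::strtok(nullptr, delim)) {
        result.emplace_back(token); // copies the token out of the buffer
    }
    return result;
}
```

Like strtok itself, this treats runs of delimiters as one separator, so empty tokens never appear in the result.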

I have rolled my own using strtok and used Boost to split a string. The best method I have found is the C++ String Toolkit Library. It is incredibly flexible and fast.

    #include <iostream>
    #include <vector>
    #include <string>
    #include <strtk.hpp>

    const char *whitespace = " \t\r\n\f";
    const char *whitespace_and_punctuation = " \t\r\n\f;,=";

    int main() {
        {   // normal parsing of a string into a vector of strings
            std::string s("Somewhere down the road");
            std::vector<std::string> result;
            if( strtk::parse( s, whitespace, result ) ) {
                for(size_t i = 0; i < result.size(); ++i )
                    std::cout << result[i] << std::endl;
            }
        }

        {   // parsing a string into a vector of floats with other separators
            // besides spaces
            std::string s("3.0, 3.14; 4.0");
            std::vector<float> values;
            if( strtk::parse( s, whitespace_and_punctuation, values ) ) {
                for(size_t i = 0; i < values.size(); ++i )
                    std::cout << values[i] << std::endl;
            }
        }

        {   // parsing a string into specific variables
            std::string s("angle = 45; radius = 9.9");
            std::string w1, w2;
            float v1, v2;
            if( strtk::parse( s, whitespace_and_punctuation, w1, v1, w2, v2) ) {
                std::cout << "word " << w1 << ", value " << v1 << std::endl;
                std::cout << "word " << w2 << ", value " << v2 << std::endl;
            }
        }

        return 0;
    }

The toolkit has much more flexibility than this simple example shows, but its utility in parsing a string into useful elements is incredible.

stringstream can be convenient if you need to parse the string by non-space symbols:

    string s = "Name:JAck; Spouse:Susan; ...";
    string dummy, name, spouse;

    istringstream iss(s);
    getline(iss, dummy, ':');
    getline(iss, name, ';');
    getline(iss, dummy, ':');
    getline(iss, spouse, ';');

I like the following because it puts the results into a vector, supports a string as a delimiter, and gives control over keeping empty values. But it doesn't look as good then.

    #include <iostream>
    #include <string>
    #include <vector>
    #include <algorithm>
    #include <iterator>
    using namespace std;

    vector<string> split(const string& s, const string& delim, const bool keep_empty = true) {
        vector<string> result;
        if (delim.empty()) {
            result.push_back(s);
            return result;
        }
        string::const_iterator substart = s.begin(), subend;
        while (true) {
            subend = search(substart, s.end(), delim.begin(), delim.end());
            string temp(substart, subend);
            if (keep_empty || !temp.empty()) {
                result.push_back(temp);
            }
            if (subend == s.end()) {
                break;
            }
            substart = subend + delim.size();
        }
        return result;
    }

    int main() {
        const vector<string> words = split("So close no matter how far", " ");
        copy(words.begin(), words.end(), ostream_iterator<string>(cout, "\n"));
    }

Of course, Boost has a split() that works partially like that. And if by "whitespace" you really do mean any type of whitespace, using Boost's split with is_any_of() works great.

Yet another flexible and fast way.

    template<typename Operator>
    void tokenize(Operator& op, const char* input, const char* delimiters) {
        const char* s = input;
        const char* e = s;
        while (*e != 0) {
            e = s;
            while (*e != 0 && strchr(delimiters, *e) == 0) ++e;
            if (e - s > 0) {
                op(s, e - s);
            }
            s = e + 1;
        }
    }

To use it with a vector of strings (Edit: since someone pointed out not to inherit from STL classes... hrmf ;) ):

    template<class ContainerType>
    class Appender {
    public:
        Appender(ContainerType& container) : container_(container) {;}
        void operator() (const char* s, unsigned length) {
            container_.push_back(std::string(s, length));
        }
    private:
        ContainerType& container_;
    };

    std::vector<std::string> strVector;
    Appender<std::vector<std::string>> v(strVector);
    tokenize(v, "A number of words to be tokenized", " \t");

That's it! And that's just one way to use the tokenizer, for example to just count words:

    class WordCounter {
    public:
        WordCounter() : noOfWords(0) {}
        void operator() (const char*, unsigned) {
            ++noOfWords;
        }
        unsigned noOfWords;
    };

    WordCounter wc;
    tokenize(wc, "A number of words to be counted", " \t");
    ASSERT( wc.noOfWords == 7 );

Limited by imagination ;)

For those with whom it doesn't sit well to sacrifice all efficiency for code size, and who see "efficient" as a type of elegance, the following should hit a sweet spot (and I think the template container class is an awesomely elegant addition):

    template < class ContainerT >
    void tokenize(const std::string& str, ContainerT& tokens,
                  const std::string& delimiters = " ", bool trimEmpty = false) {
        std::string::size_type pos, lastPos = 0, length = str.length();

        using value_type = typename ContainerT::value_type;
        using size_type  = typename ContainerT::size_type;

        while(lastPos < length + 1) {
            pos = str.find_first_of(delimiters, lastPos);
            if(pos == std::string::npos) {
                pos = length;
            }

            if(pos != lastPos || !trimEmpty)
                tokens.push_back(value_type(str.data()+lastPos,
                                            (size_type)pos-lastPos ));

            lastPos = pos + 1;
        }
    }

I usually choose std::vector<std::string> as my second parameter (ContainerT), but list<> is way faster than vector<> when direct access is not needed, and you can even create your own string class and use something like std::list<subString>, where subString does not make any copies, for incredible speed increases.

It's more than twice as fast as the fastest tokenize on this page and almost 5 times faster than some others. Also, with the perfect parameter types you can eliminate all string and list copies for additional speed increases.

Additionally, it does not return the result (which is extremely inefficient), but rather takes the tokens container by reference, which also allows you to build up tokens using multiple calls if you so wish.

Lastly, it allows you to specify whether to trim empty tokens from the results via a last optional parameter.

All it needs is std::string; the rest are optional. It does not use streams or the Boost library, but it is flexible enough to accept some of these foreign types naturally.
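To make the trimEmpty flag and the non-copying idea concrete, here is a small sketch. The tokenize template is repeated from above so the block compiles alone. The helpers count_tokens and views are hypothetical, and std::string_view (C++17) is only an assumed stand-in for the "subString" class the answer mentions:

```cpp
#include <list>
#include <string>
#include <string_view>
#include <vector>

// tokenize() as given above, repeated so this sketch is self-contained.
template <class ContainerT>
void tokenize(const std::string& str, ContainerT& tokens,
              const std::string& delimiters = " ", bool trimEmpty = false) {
    std::string::size_type pos, lastPos = 0, length = str.length();
    using value_type = typename ContainerT::value_type;
    using size_type = typename ContainerT::size_type;
    while (lastPos < length + 1) {
        pos = str.find_first_of(delimiters, lastPos);
        if (pos == std::string::npos) pos = length;
        if (pos != lastPos || !trimEmpty)
            tokens.push_back(value_type(str.data() + lastPos, (size_type)pos - lastPos));
        lastPos = pos + 1;
    }
}

// Hypothetical helper: number of comma-separated tokens, with or without
// empty tokens being trimmed.
std::size_t count_tokens(const std::string& s, bool trimEmpty) {
    std::vector<std::string> out;
    tokenize(s, out, ",", trimEmpty);
    return out.size();
}

// Hypothetical helper: string_view tokens point into `s` instead of copying
// it, so they are only valid while `s` is alive and unmodified.
std::list<std::string_view> views(const std::string& s) {
    std::list<std::string_view> out;
    tokenize(s, out, ",", true);
    return out;
}
```

For the input "a,,b", count_tokens gives 3 with trimEmpty set to false and 2 with it set to true.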

For what it's worth, here's another way to extract tokens from an input string, relying only on standard library facilities. It's an example of the power and elegance behind the design of the STL.

    #include <iostream>
    #include <string>
    #include <sstream>
    #include <algorithm>
    #include <iterator>

    int main() {
        using namespace std;
        string sentence = "And I feel fine...";
        istringstream iss(sentence);
        copy(istream_iterator<string>(iss),
             istream_iterator<string>(),
             ostream_iterator<string>(cout, "\n"));
    }

Instead of copying the extracted tokens to an output stream, one could insert them into a container, using the same generic copy algorithm.

    vector<string> tokens;
    copy(istream_iterator<string>(iss),
         istream_iterator<string>(),
         back_inserter(tokens));

... or create the vector directly:

    vector<string> tokens{istream_iterator<string>{iss},
                          istream_iterator<string>{}};

If you like to use Boost, but want to use a whole string as the delimiter (instead of single characters as in most of the previously proposed solutions), you can use boost::algorithm::split_iterator.

Example code including a convenient template:

    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>
    #include <boost/algorithm/string.hpp>

    template<typename OutputIterator>
    inline void split(const std::string& str, const std::string& delim,
                      OutputIterator result) {
        using namespace boost::algorithm;
        typedef split_iterator<std::string::const_iterator> It;

        for (It iter = make_split_iterator(str, first_finder(delim, is_equal()));
             iter != It(); ++iter) {
            *(result++) = boost::copy_range<std::string>(*iter);
        }
    }

    int main(int argc, char* argv[]) {
        using namespace std;

        vector<string> splitted;
        split("HelloFOOworldFOO!", "FOO", back_inserter(splitted));

        // or directly to console, for example
        split("HelloFOOworldFOO!", "FOO", ostream_iterator<string>(cout, "\n"));
        return 0;
    }

I have a 2-line solution to this problem:

    char sep = ' ';
    std::string s = "1 This is an example";

    for (size_t p = 0, q = 0; p != s.npos; p = q)
        std::cout << s.substr(p + (p != 0), (q = s.find(sep, p + 1)) - p - (p != 0)) << std::endl;

Then, instead of printing, you can put it in a vector.
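Collecting into a vector could look like the sketch below. It is a more spelled-out variant of the same find/substr walk rather than the exact loop above, and split_to_vector is a made-up name:

```cpp
#include <string>
#include <vector>

// Sketch: walk the string with find(), but collect the pieces instead of
// printing them.
std::vector<std::string> split_to_vector(const std::string& s, char sep) {
    std::vector<std::string> out;
    std::size_t start = 0;
    while (true) {
        std::size_t end = s.find(sep, start);
        // When end is npos the length is huge, and substr clamps it to the
        // rest of the string.
        out.push_back(s.substr(start, end - start));
        if (end == std::string::npos) break;
        start = end + 1;
    }
    return out;
}
```

Unlike the two-liner, this variant also reports a separator at position 0 as an empty first token.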

A possible solution using Boost might be:

    #include <boost/algorithm/string.hpp>

    std::vector<std::string> strs;
    boost::split(strs, "string to split", boost::is_any_of("\t "));

This approach might be even faster than the stringstream approach. And since this is a generic template function, it can be used to split other types of strings (wchar, etc. or UTF-8) using all kinds of delimiters.

See the documentation for details.

Using std::stringstream as you have works perfectly fine, and does exactly what you wanted. However, if you're just looking for a different way of doing things, you can use std::find() / std::find_first_of() and std::string::substr().

Here's an example:

    #include <iostream>
    #include <string>

    int main() {
        std::string s("Somewhere down the road");
        std::string::size_type prev_pos = 0, pos = 0;

        while( (pos = s.find(' ', pos)) != std::string::npos ) {
            std::string substring( s.substr(prev_pos, pos-prev_pos) );
            std::cout << substring << '\n';
            prev_pos = ++pos;
        }

        std::string substring( s.substr(prev_pos, pos-prev_pos) ); // Last word
        std::cout << substring << '\n';

        return 0;
    }

I use this to split a string by a delimiter. The first puts the results in a pre-constructed vector, the second returns a new vector.

    #include <string>
    #include <sstream>
    #include <vector>
    #include <iterator>

    template<typename Out>
    void split(const std::string &s, char delim, Out result) {
        std::stringstream ss(s);
        std::string item;
        while (std::getline(ss, item, delim)) {
            *(result++) = item;
        }
    }

    std::vector<std::string> split(const std::string &s, char delim) {
        std::vector<std::string> elems;
        split(s, delim, std::back_inserter(elems));
        return elems;
    }

Note that this solution does not skip empty tokens, so the following will find 4 items, one of which is empty:

    std::vector<std::string> x = split("one:two::three", ':');
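The empty token can be checked with a short sketch, together with one possible way of dropping it. Both function names are hypothetical, and the getline loop is a compact restatement of the answer's code so that the block compiles alone:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Compact restatement of the getline-based split shown above.
std::vector<std::string> split_keep_empty(const std::string& s, char delim) {
    std::vector<std::string> elems;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) elems.push_back(item);
    return elems;
}

// Hypothetical variant: same split, but empty tokens are filtered out.
std::vector<std::string> split_skip_empty(const std::string& s, char delim) {
    std::vector<std::string> elems;
    for (const std::string& item : split_keep_empty(s, delim))
        if (!item.empty()) elems.push_back(item);
    return elems;
}
```

For "one:two::three" the first function returns four items with an empty third one, and the second returns three. One detail of the getline approach: a trailing delimiter does not produce a final empty token, because getline finds nothing left to extract.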

What about this?

    #include <string>
    #include <vector>

    using namespace std;

    vector<string> split(string str, const char delim) {
        vector<string> v;
        string tmp;

        for (string::const_iterator i = str.begin(); i != str.end(); ++i) {
            if (*i != delim) {
                tmp += *i;
            } else {
                v.push_back(tmp);
                tmp = "";
            }
        }
        v.push_back(tmp); // last token, after the final delimiter

        return v;
    }

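Here's another way of doing it.

    #include <cctype>
    #include <string>
    #include <vector>
    using namespace std;

    void split_string(string text, vector<string>& words) {
        int i = 0;
        char ch;
        string word;

        // text[text.size()] is '\0', which ends the loop
        while ((ch = text[i++])) {
            if (isspace(static_cast<unsigned char>(ch))) {
                if (!word.empty()) {
                    words.push_back(word);
                }
                word = "";
            } else {
                word += ch;
            }
        }
        if (!word.empty()) {
            words.push_back(word);
        }
    }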

Short and elegant

    #include <vector>
    #include <string>
    using namespace std;

    vector<string> split(string data, string token) {
        vector<string> output;
        size_t pos = string::npos; // size_t to avoid improbable overflow
        do {
            pos = data.find(token);
            output.push_back(data.substr(0, pos));
            if (string::npos != pos)
                data = data.substr(pos + token.size());
        } while (string::npos != pos);
        return output;
    }

It can use any string as a delimiter, and it can also be used with binary data (std::string supports binary data, including nulls).

Usage:

    auto a = split("this!!is!!!example!string", "!!");

Output:

    this
    is
    !example!string

I like to use the Boost/regex methods for this task, since they provide maximum flexibility for specifying the splitting criteria.

    #include <iostream>
    #include <string>
    #include <boost/regex.hpp>

    int main() {
        std::string line("A:::line::to:split");
        const boost::regex re(":+"); // one or more colons

        // -1 means find inverse matches aka split
        boost::sregex_token_iterator tokens(line.begin(), line.end(), re, -1);
        boost::sregex_token_iterator end;

        for (; tokens != end; ++tokens)
            std::cout << *tokens << std::endl;
    }

The following code uses strtok() to split a string into tokens and stores the tokens in a vector.

    #include <iostream>
    #include <algorithm>
    #include <cstring>
    #include <vector>
    #include <string>

    using namespace std;

    char one_line_string[] = "hello hi how are you nice weather we are having ok then bye";
    char seps[] = " ,\t\n";
    char *token;

    int main() {
        vector<string> vec_String_Lines;
        token = strtok( one_line_string, seps );

        cout << "Extracting and storing data in a vector..\n\n\n";

        while( token != NULL ) {
            vec_String_Lines.push_back(token);
            token = strtok( NULL, seps );
        }
        cout << "Displaying end result in vector line storage..\n\n";

        for ( size_t i = 0; i < vec_String_Lines.size(); ++i)
            cout << vec_String_Lines[i] << "\n";
        cout << "\n\n\n";

        return 0;
    }

This answer takes the string and puts it into a vector of strings. It uses the Boost library.

    #include <boost/algorithm/string.hpp>

    std::vector<std::string> strs;
    boost::split(strs, "string to split", boost::is_any_of("\t "));

I made this because I needed an easy way to split strings and C-based strings. Hopefully someone else can find it useful as well. Also, it doesn't rely on tokens, and you can use fields as delimiters, which is another key feature I needed.

I'm sure there are improvements that could make it even more elegant, so please do make them by all means.

StringSplitter.hpp:

#include <vector> #include <iostream> #include <string.h> using namespace std; class StringSplit { private: void copy_fragment(char*, char*, char*); void copy_fragment(char*, char*, char); bool match_fragment(char*, char*, int); int untilnextdelim(char*, char); int untilnextdelim(char*, char*); void assimilate(char*, char); void assimilate(char*, char*); bool string_contains(char*, char*); long calc_string_size(char*); void copy_string(char*, char*); public: vector<char*> split_cstr(char); vector<char*> split_cstr(char*); vector<string> split_string(char); vector<string> split_string(char*); char* String; bool do_string; bool keep_empty; vector<char*> Container; vector<string> ContainerS; StringSplit(char * in) { String = in; } StringSplit(string in) { size_t len = calc_string_size((char*)in.c_str()); String = new char[len + 1]; memset(String, 0, len + 1); copy_string(String, (char*)in.c_str()); do_string = true; } ~StringSplit() { for (int i = 0; i < Container.size(); i++) { if (Container[i] != NULL) { delete[] Container[i]; } } if (do_string) { delete[] String; } } };

StringSplitter.cpp:

#include <string.h> #include <iostream> #include <vector> #include "StringSplit.hpp" using namespace std; void StringSplit::assimilate(char*src, char delim) { int until = untilnextdelim(src, delim); if (until > 0) { char * temp = new char[until + 1]; memset(temp, 0, until + 1); copy_fragment(temp, src, delim); if (keep_empty || *temp != 0) { if (!do_string) { Container.push_back(temp); } else { string x = temp; ContainerS.push_back(x); } } else { delete[] temp; } } } void StringSplit::assimilate(char*src, char* delim) { int until = untilnextdelim(src, delim); if (until > 0) { char * temp = new char[until + 1]; memset(temp, 0, until + 1); copy_fragment(temp, src, delim); if (keep_empty || *temp != 0) { if (!do_string) { Container.push_back(temp); } else { string x = temp; ContainerS.push_back(x); } } else { delete[] temp; } } } long StringSplit::calc_string_size(char* _in) { long i = 0; while (*_in++) { i++; } return i; } bool StringSplit::string_contains(char* haystack, char* needle) { size_t len = calc_string_size(needle); size_t lenh = calc_string_size(haystack); while (lenh--) { if (match_fragment(haystack + lenh, needle, len)) { return true; } } return false; } bool StringSplit::match_fragment(char* _src, char* cmp, int len) { while (len--) { if (*(_src + len) != *(cmp + len)) { return false; } } return true; } int StringSplit::untilnextdelim(char* _in, char delim) { size_t len = calc_string_size(_in); if (*_in == delim) { _in += 1; return len - 1; } int c = 0; while (*(_in + c) != delim && c < len) { c++; } return c; } int StringSplit::untilnextdelim(char* _in, char* delim) { int s = calc_string_size(delim); int c = 1 + s; if (!string_contains(_in, delim)) { return calc_string_size(_in); } else if (match_fragment(_in, delim, s)) { _in += s; return calc_string_size(_in); } while (!match_fragment(_in + c, delim, s)) { c++; } return c; } void StringSplit::copy_fragment(char* dest, char* src, char delim) { if (*src == delim) { src++; } int c = 0; while (*(src + c) 
!= delim && *(src + c)) { *(dest + c) = *(src + c); c++; } *(dest + c) = 0; } void StringSplit::copy_string(char* dest, char* src) { int i = 0; while (*(src + i)) { *(dest + i) = *(src + i); i++; } } void StringSplit::copy_fragment(char* dest, char* src, char* delim) { size_t len = calc_string_size(delim); size_t lens = calc_string_size(src); if (match_fragment(src, delim, len)) { src += len; lens -= len; } int c = 0; while (!match_fragment(src + c, delim, len) && (c < lens)) { *(dest + c) = *(src + c); c++; } *(dest + c) = 0; } vector<char*> StringSplit::split_cstr(char Delimiter) { int i = 0; while (*String) { if (*String != Delimiter && i == 0) { assimilate(String, Delimiter); } if (*String == Delimiter) { assimilate(String, Delimiter); } i++; String++; } String -= i; delete[] String; return Container; } vector<string> StringSplit::split_string(char Delimiter) { do_string = true; int i = 0; while (*String) { if (*String != Delimiter && i == 0) { assimilate(String, Delimiter); } if (*String == Delimiter) { assimilate(String, Delimiter); } i++; String++; } String -= i; delete[] String; return ContainerS; } vector<char*> StringSplit::split_cstr(char* Delimiter) { int i = 0; size_t LenDelim = calc_string_size(Delimiter); while(*String) { if (!match_fragment(String, Delimiter, LenDelim) && i == 0) { assimilate(String, Delimiter); } if (match_fragment(String, Delimiter, LenDelim)) { assimilate(String,Delimiter); } i++; String++; } String -= i; delete[] String; return Container; } vector<string> StringSplit::split_string(char* Delimiter) { do_string = true; int i = 0; size_t LenDelim = calc_string_size(Delimiter); while (*String) { if (!match_fragment(String, Delimiter, LenDelim) && i == 0) { assimilate(String, Delimiter); } if (match_fragment(String, Delimiter, LenDelim)) { assimilate(String, Delimiter); } i++; String++; } String -= i; delete[] String; return ContainerS; }

Examples:

    int main(int argc, char*argv[]) {
        StringSplit ss = "This:CUT:is:CUT:an:CUT:example:CUT:cstring";
        vector<char*> Split = ss.split_cstr(":CUT:");

        for (int i = 0; i < Split.size(); i++) {
            cout << Split[i] << endl;
        }

        return 0;
    }

Will output:

    This
    is
    an
    example
    cstring

    int main(int argc, char*argv[]) {
        StringSplit ss = "This:is:an:example:cstring";
        vector<char*> Split = ss.split_cstr(':');

        for (int i = 0; i < Split.size(); i++) {
            cout << Split[i] << endl;
        }

        return 0;
    }

    int main(int argc, char*argv[]) {
        string mystring = "This[SPLIT]is[SPLIT]an[SPLIT]example[SPLIT]string";
        StringSplit ss = mystring;
        vector<string> Split = ss.split_string("[SPLIT]");

        for (int i = 0; i < Split.size(); i++) {
            cout << Split[i] << endl;
        }

        return 0;
    }

    int main(int argc, char*argv[]) {
        string mystring = "This|is|an|example|string";
        StringSplit ss = mystring;
        vector<string> Split = ss.split_string('|');

        for (int i = 0; i < Split.size(); i++) {
            cout << Split[i] << endl;
        }

        return 0;
    }

To keep empty entries (by default empties will be excluded):

    StringSplit ss = mystring;
    ss.keep_empty = true;
    vector<string> Split = ss.split_string(":DELIM:");

The goal was to make it similar to C#'s Split() method, where splitting a string is as easy as:

    String[] Split = "Hey:cut:what's:cut:your:cut:name?".Split(new[]{":cut:"}, StringSplitOptions.None);

    foreach(String X in Split) {
        Console.Write(X);
    }

I hope someone else can find this as useful as I do.

Get Boost! :-)

    #include <boost/algorithm/string/split.hpp>
    #include <boost/algorithm/string.hpp>
    #include <iostream>
    #include <vector>

    using namespace std;
    using namespace boost;

    int main(int argc, char**argv) {
        typedef vector < string > list_type;

        list_type list;
        string line;

        line = "Somewhere down the road";
        split(list, line, is_any_of(" "));

        for(size_t i = 0; i < list.size(); i++) {
            cout << list[i] << endl;
        }

        return 0;
    }

This example gives the output:

    Somewhere
    down
    the
    road

Recently I had to split a camel-cased word into subwords. There are no delimiters, just upper-case characters.

    #include <string>
    #include <list>
    #include <locale> // two-argument std::isupper

    template<class String>
    const std::list<String> split_camel_case_string(const String &s) {
        std::list<String> R;
        String w;

        for (typename String::const_iterator i = s.begin(); i < s.end(); ++i) {
            // The <locale> overload works for both char and wchar_t.
            if (std::isupper(*i, std::locale())) {
                if (w.length()) {
                    R.push_back(w);
                    w.clear();
                }
            }
            w += *i;
        }

        if (w.length())
            R.push_back(w);
        return R;
    }

For example, this splits "AQueryTrades" into "A", "Query" and "Trades". The function works with narrow and wide strings. Because it respects the current locale, it splits "RaumfahrtÜberwachungsVerordnung" into "Raumfahrt", "Überwachungs" and "Verordnung".

Note that std::isupper should really be passed as a function template argument. A more generalized version of this function could then split at delimiters like ",", ";" or " " too.
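One way the predicate could be lifted into a template argument is sketched below. The names split_before and is_upper_char are invented for illustration, and the behaviour is an assumption about what the generalisation would look like:

```cpp
#include <list>
#include <locale>
#include <string>

// Hypothetical generalisation: a new token starts before every character
// for which starts_token(c) is true. That character is kept in the token.
template <class String, class Pred>
std::list<String> split_before(const String& s, Pred starts_token) {
    std::list<String> result;
    String word;
    for (typename String::const_iterator i = s.begin(); i != s.end(); ++i) {
        if (starts_token(*i) && !word.empty()) {
            result.push_back(word);
            word.clear();
        }
        word += *i;
    }
    if (!word.empty()) result.push_back(word);
    return result;
}

// Hypothetical predicate reproducing the camel-case behaviour, using the
// global C++ locale.
inline bool is_upper_char(char c) { return std::isupper(c, std::locale()); }
```

Calling split_before(std::string("AQueryTrades"), is_upper_char) gives the same three words as before. With a delimiter predicate such as a comma test, the delimiter would stay at the front of each following token, so dropping it would need one further change.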

    #include <vector>
    #include <string>
    #include <sstream>

    int main() {
        std::string str("Split me by whitespaces");
        std::string buf;                 // Have a buffer string
        std::stringstream ss(str);       // Insert the string into a stream

        std::vector<std::string> tokens; // Create vector to hold our words

        while (ss >> buf)
            tokens.push_back(buf);

        return 0;
    }