This is a really elementary question, but I can't get it to work. My sample text is provided at the bottom of the page.
The only rows I want to keep are the ones that look like this: "178-207 30 WVRTRWALLLLFWLGWLGMLAGAVVIIVRA -3,95". I currently use TextWrangler on OSX (the terminal and I are not friends), which provides regex replacements.
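To make the target concrete, here is a minimal Python sketch of a pattern describing the rows to keep. The shape it assumes (a "start-end" range, a length, a run of letters, and a score with a decimal comma that may be negative) is inferred from the example row only, and the names `ROW` and `sample` are made up for the illustration.

```python
import re

# Assumed shape of a wanted row: "start-end length SEQUENCE score",
# where the score uses a decimal comma and may be negative.
ROW = re.compile(r"^\d+-\d+\s+\d+\s+[A-Za-z]+\s+-?\d+,\d+$")

# A few lines copied from the sample text at the bottom of the page.
sample = """Number of translocon TM predicted segments: 2
178-207 30 WVRTRWALLLLFWLGWLGMLAGAVVIIVRA -3,95
438-460 23 ARLLTSFLPAQLLRLYQLMLFTL 1,63
Working sequence length = 630):
MELQPPEASIAVVSIPRQLPGShSEAGVQGLSAGDDSELGShCVAQTGLELLASGDPLPS"""

# Keep only the lines that match the row pattern as a whole.
kept = [line for line in sample.splitlines() if ROW.match(line.strip())]
print("\n".join(kept))
```

Filtering on the wanted rows directly would avoid deleting the other parts one step at a time, provided every wanted row really follows this shape.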
I am trying to do this in steps, and my first step is trying to get rid of all the protein sequences.
In TextWrangler, I search for this:
Working sequence([^;]*)------------------------------------------------------------
and replace it with nothing. However, I end up with an almost empty document, because TextWrangler seems to match from the first instance of "Working sequence" all the way to the LAST instance of "------------------------------------------------------------". How do I change this so that it works step-wise: match the first instance of each and replace that with nothing, then the second instance, and so on?
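The behaviour described above is what a greedy quantifier produces: `[^;]*` takes as much text as it can, so the match only ends at the last dashed line. A minimal Python sketch of the difference follows. The text in it is a shortened, hypothetical stand-in for the real file, and `-{10,}` (ten or more dashes) stands in for the full dashed line. The lazy `*?` form is standard in Perl-style regex engines, but whether TextWrangler behaves identically is an assumption here.

```python
import re

# Hypothetical, shortened stand-in for the real file: two records, each with
# a "Working sequence" block that ends in a line of dashes.
text = (
    "178-207 30 WVRTRWALLLLFWLGWLGMLAGAVVIIVRA -3,95\n"
    "Working sequence length = 630):\n"
    "MELQPPEASIAVVSIPRQLPGS\n"
    "------------\n"
    "19-43 25 RVCTLFIIGFKFTFFVSIMIYWhVV -1,04\n"
    "Working sequence length = 353):\n"
    "MSKPPDLLLRLLRGAPRQRV\n"
    "------------\n"
)

# Greedy: [^;]* grabs as much as possible, so one match runs from the first
# "Working sequence" to the LAST dashed line and swallows the second record.
greedy = re.sub(r"Working sequence[^;]*-{10,}", "", text)

# Lazy: the added "?" makes the quantifier stop at the FIRST dashed line it
# can reach, so each block is removed separately.
lazy = re.sub(r"Working sequence[^;]*?-{10,}", "", text)

print(lazy)
```

In the greedy result the "19-43 ..." row is gone along with everything between the blocks; in the lazy result both rows survive and only the two sequence blocks are removed.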
Thanks and greetings from Sweden
Results summary for protein: sp|P08195|4F2_HUMAN 4F2 GN=SLC3A2 PE=1 SV=3
Translocon TM Analysis Results
Partitioning: water to bilayer
Window range: 19-30
Number of translocon TM predicted segments: 2
178-207 30 WVRTRWALLLLFWLGWLGMLAGAVVIIVRA -3,95
438-460 23 ARLLTSFLPAQLLRLYQLMLFTL 1,63
Working sequence length = 630):
MELQPPEASIAVVSIPRQLPGShSEAGVQGLSAGDDSELGShCVAQTGLELLASGDPLPS
ASQNAEMIETGSDCVTQAGLQLLASSDPPALASKNAEVTGTMSQDTEVDMKEVELNELEP
EKQPMNAASGAAMSLAGAEKNGLVKIKVAEDEAEAAAAAKFTGLSKEELLKVAGSPGWVR
TRWALLLLFWLGWLGMLAGAVVIIVRAPRCRELPAQKWWhTGALYRIGDLQAFQGhGAGN
LAGLKGRLDYLSSLKVKGLVLGPIhKNQKDDVAQTDLLQIDPNFGSKEDFDSLLQSAKKK
SIRVILDLTPNYRGENSWFSTQVDTVATKVKDALEFWLQAGVDGFQVRDIENLKDASSFL
AEWQNITKGFSEDRLLIAGTNSSDLQQILSLLESNKDLLLTSSYLSDSGSTGEhTKSLVT
QYLNATGNRWCSWSLSQARLLTSFLPAQLLRLYQLMLFTLPGTPVFSYGDEIGLDAAALP
GQPMEAPVMLWDESSFPDIPGAVSANMTVKGQSEDPGSLLSLFRRLSDQRSKERSLLhGD
FhAFSAGPGLFSYIRhWDQNERFLVVLNFGDVGLSAGLQASDLPASASLPAKADLLLSTQ
PGREEGSPLELERLKLEPhEGLLLRFPYAA
Results summary for protein: sp|Q9NPC4|A4GAT_HUMAN OS=Homo sapiens GN=A4GALT PE=2 SV=1
Translocon TM Analysis Results
Partitioning: water to bilayer
Window range: 19-30
Number of translocon TM predicted segments: 1
19-43 25 RVCTLFIIGFKFTFFVSIMIYWhVV -1,04
Working sequence length = 353):
MSKPPDLLLRLLRGAPRQRVCTLFIIGFKFTFFVSIMIYWhVVGEPKEKGQLYNLPAEIP
CPTLTPPTPPShGPTPGNIFFLETSDRTNPNFLFMCSVESAARThPEShVLVLMKGLPGG
NASLPRhLGISLLSCFPNVQMLPLDLRELFRDTPLADWYAAVQGRWEPYLLPVLSDASRI
ALMWKFGGIYLDTDFIVLKNLRNLTNVLGTQSRYVLNGAFLAFERRhEFMALCMRDFVDh
YNGWIWGhQGPQLLTRVFKKWCSIRSLAESRACRGVTTLPPEAFYPIPWQDWKKYFEDIN
PEELPRLLSATYAVhVWNKKSQGTRFEATSRALLAQLhARYCPTThEAMKMYL