Clone Mass | Clones in CloneSet | Parameter Count | Clone Similarity | Syntax Category [Sequence Length] |
---|---|---|---|---|
20 | 3 | 5 | 0.976 | stmt_list[3] |
Clone Instance | Line Count | Source Line | Source File |
---|---|---|---|
1 | 23 | 999 | Bio/SeqIO/QualityIO.py |
2 | 21 | 1161 | Bio/SeqIO/QualityIO.py |
3 | 20 | 1210 | Bio/SeqIO/QualityIO.py |
Clone Instance 1 (Bio/SeqIO/QualityIO.py, source line 999):

    #Originally, I used a list expression for each record:
    #
    # qualities = [ord(letter)-SANGER_SCORE_OFFSET for letter in quality_string]
    #
    #Precomputing is faster, perhaps partly by avoiding the subtractions.
    q_mapping = dict( )
    for letter in range(0,255):
        q_mapping[chr(letter)] = letter-SANGER_SCORE_OFFSET
    for title_line,seq_string,quality_string in FastqGeneralIterator(handle):
        if title2ids:
            id,name,descr = title2ids(title_line)
        else:
            descr = title_line
            id = descr.split( )[0]
            name = id
        record = SeqRecord(Seq(seq_string,alphabet),id = id,name = name,description = descr)
        qualities = [q_mapping[letter] for letter in quality_string]
        if qualities and (min(qualities)<0 or max(qualities)>93):
            raise ValueError("Invalid character in quality string")
        #For speed, will now use a dirty trick to speed up assigning the
        #qualities. We do this to bypass the length check imposed by the
        #per-letter-annotations restricted dict (as this has already been
        #checked by FastqGeneralIterator). This is equivalent to:
        #record.letter_annotations["phred_quality"] = qualities
        dict.__setitem__(record._per_letter_annotations,"phred_quality",qualities)
        yield record #This is a generator function!

Clone Instance 2 (Bio/SeqIO/QualityIO.py, source line 1161):

    q_mapping = dict( )
    for letter in range(0,255):
        q_mapping[chr(letter)] = letter-SOLEXA_SCORE_OFFSET
    for title_line,seq_string,quality_string in FastqGeneralIterator(handle):
        if title2ids:
            id,name,descr = title_line
        else:
            descr = title_line
            id = descr.split( )[0]
            name = id
        record = SeqRecord(Seq(seq_string,alphabet),id = id,name = name,description = descr)
        qualities = [q_mapping[letter] for letter in quality_string]
        #DO NOT convert these into PHRED qualities automatically!
        if qualities and (min(qualities)< -5 or max(qualities)>62):
            raise ValueError("Invalid character in quality string")
        #Dirty trick to speed up this line:
        #record.letter_annotations["solexa_quality"] = qualities
        dict.__setitem__(record._per_letter_annotations,"solexa_quality",qualities)
        yield record #This is a generator function!

Clone Instance 3 (Bio/SeqIO/QualityIO.py, source line 1210):

    q_mapping = dict( )
    for letter in range(0,255):
        q_mapping[chr(letter)] = letter-SOLEXA_SCORE_OFFSET
    for title_line,seq_string,quality_string in FastqGeneralIterator(handle):
        if title2ids:
            id,name,descr = title2ids(title_line)
        else:
            descr = title_line
            id = descr.split( )[0]
            name = id
        record = SeqRecord(Seq(seq_string,alphabet),id = id,name = name,description = descr)
        qualities = [q_mapping[letter] for letter in quality_string]
        if qualities and (min(qualities)<0 or max(qualities)>62):
            raise ValueError("Invalid character in quality string")
        #Dirty trick to speed up this line:
        #record.letter_annotations["phred_quality"] = qualities
        dict.__setitem__(record._per_letter_annotations,"phred_quality",qualities)
        yield record
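
All three clones precompute a `q_mapping` lookup table instead of calling `ord()` and subtracting for every quality character. The following is a minimal sketch of that equivalence, not code from the analyzed file: the value 33 for `SANGER_SCORE_OFFSET` is the standard Sanger FASTQ ASCII offset, and `quality_string` is a made-up example.

```python
SANGER_SCORE_OFFSET = 33  # standard ASCII offset of the Sanger FASTQ encoding

# Build the lookup table once, outside the per-record loop,
# covering the same character range as the clones above.
q_mapping = {chr(code): code - SANGER_SCORE_OFFSET for code in range(0, 255)}

quality_string = "!5I~"  # hypothetical quality line for illustration

# Table lookup, as used in the clones.
by_lookup = [q_mapping[letter] for letter in quality_string]

# The original per-character arithmetic mentioned in the source comment.
by_arithmetic = [ord(letter) - SANGER_SCORE_OFFSET for letter in quality_string]

print(by_lookup)  # [0, 20, 40, 93]
```

Both lists are identical; the table only moves the subtraction out of the inner loop.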

Clone Abstraction:

    #Originally, I used a list expression for each record:
    #
    # qualities = [ord(letter)-SANGER_SCORE_OFFSET for letter in quality_string]
    #
    #Precomputing is faster, perhaps partly by avoiding the subtractions.
    q_mapping = dict( )
    for letter in range(0,255):
        q_mapping[chr(letter)] = letter- [[#variable5de44ec0]]
    for title_line,seq_string,quality_string in FastqGeneralIterator(handle):
        if title2ids:
            id,name,descr = [[#variable600df1e0]]
        else:
            descr = title_line
            id = descr.split( )[0]
            name = id
        record = SeqRecord(Seq(seq_string,alphabet),id = id,name = name,description = descr)
        qualities = [q_mapping[letter] for letter in quality_string]
        #DO NOT convert these into PHRED qualities automatically!
        if qualities and (min(qualities)< [[#variable5de45020]] or max(qualities)> [[#variable76b96f60]]):
            raise ValueError("Invalid character in quality string")
        #Dirty trick to speed up this line:
        #For speed, will now use a dirty trick to speed up assigning the
        #qualities. We do this to bypass the length check imposed by the
        #per-letter-annotations restricted dict (as this has already been
        #checked by FastqGeneralIterator). This is equivalent to:
        #record.letter_annotations["phred_quality"] = qualities
        #record.letter_annotations["solexa_quality"] = qualities
        dict.__setitem__(record._per_letter_annotations, [[#variable76b92ba0]],qualities)
        yield record #This is a generator function!
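
The "dirty trick" comments in the clones describe calling `dict.__setitem__` directly so that the length check of the per-letter-annotations restricted dict is skipped. The sketch below illustrates the mechanism only. `LengthCheckedDict` is a hypothetical stand-in written for this example and is not the class used in the analyzed file.

```python
class LengthCheckedDict(dict):
    """Hypothetical stand-in for a restricted dict that validates value length."""

    def __init__(self, length):
        super().__init__()
        self._length = length

    def __setitem__(self, key, value):
        # Normal item assignment goes through this check.
        if len(value) != self._length:
            raise TypeError("value must have length %i" % self._length)
        super().__setitem__(key, value)


annotations = LengthCheckedDict(4)

# Validated path: the subclass __setitem__ runs and accepts a 4-item list.
annotations["phred_quality"] = [0, 20, 40, 93]

# Validated path rejects a value of the wrong length.
try:
    annotations["too_short"] = [1, 2]
    rejected = False
except TypeError:
    rejected = True

# Bypass: calling the base-class method skips the subclass check entirely,
# so even a wrong-length value is stored. The caller must guarantee validity.
dict.__setitem__(annotations, "solexa_quality", [1, 2])
```

The bypass is only safe when the lengths have already been verified elsewhere, which the source comments attribute to `FastqGeneralIterator`.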

Parameter Bindings:

Parameter Index | Clone Instance | Parameter Name | Value |
---|---|---|---|
1 | 1 | [[#5de44ec0]] | SANGER_SCORE_OFFSET |
1 | 2 | [[#5de44ec0]] | SOLEXA_SCORE_OFFSET |
1 | 3 | [[#5de44ec0]] | SOLEXA_SCORE_OFFSET |
2 | 1 | [[#600df1e0]] | title2ids(title_line) |
2 | 2 | [[#600df1e0]] | title_line |
2 | 3 | [[#600df1e0]] | title2ids(title_line) |
3 | 1 | [[#5de45020]] | 0 |
3 | 2 | [[#5de45020]] | -5 |
3 | 3 | [[#5de45020]] | 0 |
4 | 1 | [[#76b96f60]] | 93 |
4 | 2 | [[#76b96f60]] | 62 |
4 | 3 | [[#76b96f60]] | 62 |
5 | 1 | [[#76b92ba0]] | "phred_quality" |
5 | 2 | [[#76b92ba0]] | "solexa_quality" |
5 | 3 | [[#76b92ba0]] | "phred_quality" |
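
The five parameters suggest that the three clones could share one parameterized generator. The following is a hedged sketch of such a refactoring, not code from the analyzed file. `iter_quality_records` is a hypothetical name, plain dicts stand in for `SeqRecord` so the example runs without Biopython, and the input is a list of `(title, sequence, quality)` tuples rather than `FastqGeneralIterator(handle)`. The offsets 33 and 64 are the standard Sanger and Solexa ASCII offsets. Parameter 2 is assumed to be `title2ids(title_line)` in every case; the bare `title_line` binding of one instance looks like an unintended divergence, but the report alone cannot confirm that.

```python
SANGER_SCORE_OFFSET = 33  # standard Sanger FASTQ ASCII offset
SOLEXA_SCORE_OFFSET = 64  # standard Solexa / early Illumina ASCII offset


def iter_quality_records(triples, offset, min_quality, max_quality,
                         annotation_key, title2ids=None):
    """Hypothetical shared generator; yields plain dicts instead of SeqRecord."""
    # Parameter 1: the score offset used to build the lookup table.
    q_mapping = {chr(code): code - offset for code in range(0, 255)}
    for title_line, seq_string, quality_string in triples:
        if title2ids:
            # Parameter 2 (assumed): always call title2ids on the title line.
            record_id, name, descr = title2ids(title_line)
        else:
            descr = title_line
            record_id = descr.split()[0]
            name = record_id
        qualities = [q_mapping[letter] for letter in quality_string]
        # Parameters 3 and 4: the valid score range.
        if qualities and (min(qualities) < min_quality
                          or max(qualities) > max_quality):
            raise ValueError("Invalid character in quality string")
        # Parameter 5: the annotation key the scores are stored under.
        yield {"id": record_id, "name": name, "description": descr,
               "seq": seq_string, annotation_key: qualities}


# Made-up reads for illustration.
sanger_reads = [("read1 sample", "ACGT", "!5I~")]
solexa_reads = [("read2", "ACGT", ";@h~")]

sanger = list(iter_quality_records(
    sanger_reads, SANGER_SCORE_OFFSET, 0, 93, "phred_quality"))
solexa = list(iter_quality_records(
    solexa_reads, SOLEXA_SCORE_OFFSET, -5, 62, "solexa_quality"))

print(sanger[0]["phred_quality"])   # [0, 20, 40, 93]
print(solexa[0]["solexa_quality"])  # [-5, 0, 40, 62]
```

Each original clone then reduces to one call with its own row of bindings, for example `(SOLEXA_SCORE_OFFSET, 0, 62, "phred_quality")` for the third variant.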