|
Very good ...
I am interested in how you did it - would you provide the code?
|
|
|
|
|
Hi, The Types of your Columns tell me that you want to extract typed objects from your text-data. As Ralf suggests, you are going to have to create a parser to convert text to typed objects, and a typed structure to hold the typed objects. That structure could be a Class, a DataTable, or, a Struct.
But first, consider the source of your data: will that source provide the data in some structured format like XML, JSON, or even CSV? If the data is in any one of those formats, your work is much simpler.
If it is your code generating the data, then you could focus on simply writing a serializer and de-serializer for your objects/classes. Not hard to do these days.
Also, if there are relatively few data in the text file that you need to parse into objects, and display in the DataGridView, perhaps a "scraping" technique would get what you need, possibly by use of a RegEx.
Suggestion: make an outline of the underlying structure your text file embodies. Assuming there are many instances of the structure in one text file (?), define which markers delimit the beginning and end of each instance, which fields are always present, and which may be omitted.
Sketch: (ref. : [^])
using System;
using System.IO;
using System.Runtime.Serialization;
namespace MasterBlaster
{
[DataContract]
public class Blast
{
[DataMember]
public string BlastId { set; get; }
[DataMember]
public DateTime BlastTime { set; get; }
[DataMember]
public int BlastInitialDets { set; get; }
[DataMember]
public string[] BlastInitialData { set; get; }
public string tempInfo;
public Blast(string id, int idets, params string[] idata)
{
BlastId = id;
BlastInitialDets = idets;
BlastInitialData = idata;
}
public static class BlastsSerializer
{
static DataContractSerializer dcs = new DataContractSerializer(typeof(Blast[]));
public static void Serialize(string filepath, Blast[] blasts)
{
Directory.CreateDirectory(Path.GetDirectoryName(filepath));
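// FileMode.Create makes a new file or truncates an existing one; FileMode.Open would throw if the file did not exist yet
using (var writer = new FileStream(filepath, FileMode.Create, FileAccess.Write))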
{
dcs.WriteObject(writer, blasts);
}
}
public static Blast[] DeSerialize(string filepath)
{
Blast[] blasts;
using (var reader = new FileStream(filepath, FileMode.Open, FileAccess.Read))
{
blasts = dcs.ReadObject(reader) as Blast[];
}
return blasts;
}
}
}
}
cheers, Bill
«Beauty is in the eye of the beholder, and it may be necessary from time to time to give a stupid or misinformed beholder a black eye.» Miss Piggy
modified 6-Jul-17 7:37am.
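To see the serializer idea from the sketch above in action without touching the disk, here is a minimal round trip through a MemoryStream. The Reading class and the RoundTripDemo helper are hypothetical stand-ins made up for this illustration, not part of the original sketch; only DataContractSerializer and its WriteObject/ReadObject calls come from the framework.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical stand-in for a typed record such as Blast.
[DataContract]
public class Reading
{
    [DataMember] public string Id { get; set; }
    [DataMember] public DateTime Time { get; set; }
    [DataMember] public string[] Data { get; set; }
}

public static class RoundTripDemo
{
    // Serializes the array to an in-memory XML stream and reads it straight back.
    public static Reading[] RoundTrip(Reading[] input)
    {
        var dcs = new DataContractSerializer(typeof(Reading[]));
        using (var stream = new MemoryStream())
        {
            dcs.WriteObject(stream, input);
            stream.Position = 0; // rewind before reading
            return (Reading[])dcs.ReadObject(stream);
        }
    }
}
```

Only members marked [DataMember] travel through the serializer, which is why a plain field like tempInfo in the sketch above would not be written to the file.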
|
|
|
|
|
My code started out as something fairly easy to follow like this:
void RemoveDuplicatesButton_Click(object sender, RoutedEventArgs e)
{
SortedSet<string> seen = new SortedSet<string>();
StringBuilder stringBuilder = new StringBuilder();
foreach (string line in this.InputTextBox.Text.Split(new char[] { '\r', '\n' }))
{
if (seen.Add(line))
{
stringBuilder.AppendLine(line);
}
}
this.InputTextBox.Text = stringBuilder.ToString();
}
Then I factored out some of the code into extension methods, removed the unnecessary curly braces, and got this:
void RemoveDuplicatesButton_Click(object sender, RoutedEventArgs e)
{
StringBuilder stringBuilder = new StringBuilder();
foreach (string line in this.InputTextBox.Text.SplitLines().Unique<string>())
stringBuilder.AppendLine(line);
this.InputTextBox.Text = stringBuilder.ToString();
}
And then I took it a step further.
void RemoveDuplicatesButton_Click(object sender, RoutedEventArgs e) =>
this.InputTextBox.Text = this.InputTextBox.Text.SplitLines().Unique<string>().Aggregate<string, StringBuilder, string>(
new StringBuilder(),
(StringBuilder stringBuilder, string line) => stringBuilder.AppendLine(line),
(StringBuilder stringBuilder) => stringBuilder.ToString());
And the one-liner:
void RemoveDuplicatesButton_Click(object sender, RoutedEventArgs e) => this.InputTextBox.Text = this.InputTextBox.Text.SplitLines().Unique().Aggregate(new StringBuilder(), (sb, line) => sb.AppendLine(line), sb => sb.ToString());
I get great joy as a programmer by factoring out as much code as possible, but at which point should I have stopped for this example? Should I perhaps switch to another programming language that’s more conducive to this style of writing?
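The post does not show the SplitLines and Unique extension methods it relies on, so the following is only a guess at what they might look like. The class name LineExtensions, the choice of a HashSet, and the decision to treat "\r\n" as a single line break are all assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class LineExtensions
{
    // Assumed behaviour: split on CRLF, CR, or LF; blank lines are kept.
    public static IEnumerable<string> SplitLines(this string text) =>
        text.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);

    // Assumed behaviour: yield each item the first time it is seen,
    // in the order of the input sequence.
    public static IEnumerable<T> Unique<T>(this IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (T item in source)
            if (seen.Add(item))
                yield return item;
    }
}
```

Listing "\r\n" first in the separator array matters: the first handler splits on '\r' and '\n' separately, which produces an empty entry between the two characters of every Windows line ending.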
|
|
|
|
|
You shouldn't be doing that directly in the Click event handler. You need to refactor that functionality into a library of such routines.
|
|
|
|
|
The first method is a good example of "evil" comments - they just repeat the code in different words.
Fighting Evil in Your Code: Comments on Comments - Simple Talk[^]
I'd be inclined to move the final Aggregate to its own extension method - JoinLines perhaps? You can also simplify it by using String.Join[^]:
public static string JoinLines(this IEnumerable<string> lines) => string.Join(Environment.NewLine, lines);
void RemoveDuplicatesButton_Click(object sender, RoutedEventArgs e) => this.InputTextBox.Text = this.InputTextBox.Text.SplitLines().Unique().JoinLines();
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Great minds think alike
This space for rent
|
|
|
|
|
Kevin Li (Li, Ken-un) wrote: but at which point should I have stopped for this example?
I think what you've done is fantastic. It's considerably more readable than the first implementation.
Kevin Li (Li, Ken-un) wrote: Should I perhaps switch to another programming language that’s more conducive to this style of writing?
Your only other option to achieve this level of elegance (as far as I know and without dealing with arcane symbols) is F#, IMO.
But I'm with Richard -- I think the Aggregate is unnecessary.
Marc
Latest Article - Create a Dockerized Python Fiddle Web App
Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny
Artificial intelligence is the only remedy for natural stupidity. - CDP1802
|
|
|
|
|
In Backus' paper introducing Fortran (1957), the primary focus was on the speed of writing code; a secondary focus was on debugging.
Uncle Bob's most famous book "Clean Code" (2008) focuses on readability.
Why do we see that paradigm shift?
In 1957, most projects were green-field projects: for a problem, new code was written, the problem was solved, and the code was no longer needed. Old code did not need to be maintained.
Nowadays, maintenance of code (brown-field projects) dominates the scene. We have to read code far more often than we write code.
And now simply ask yourself: when someone else - or even you after a couple of weeks - reads that code, how long will it take him to understand it?
|
|
|
|
|
I personally would have left it at the first one but wouldn't have included the first three comments.
|
|
|
|
|
One minor point, you could avoid the StringBuilder in your first example with the use of string.Join. I'm not saying you should, just that you could.
SortedSet<string> seen = new SortedSet<string>();
foreach (string line in longString.Split('\r', '\n'))
{
seen.Add(line);
}
return string.Join(Environment.NewLine, seen);
If you're interested, this compiles to this IL:
.maxstack 5
.locals init (
[0] class [System]System.Collections.Generic.SortedSet`1<string> seen,
[1] string[] strArray,
[2] int32 num,
[3] string line,
[4] string str)
L_0000: nop
L_0001: newobj instance void [System]System.Collections.Generic.SortedSet`1<string>::.ctor()
L_0006: stloc.0
L_0007: nop
L_0008: ldarg.1
L_0009: ldc.i4.2
L_000a: newarr char
L_000f: dup
L_0010: ldc.i4.0
L_0011: ldc.i4.s 13
L_0013: stelem.i2
L_0014: dup
L_0015: ldc.i4.1
L_0016: ldc.i4.s 10
L_0018: stelem.i2
L_0019: callvirt instance string[] [mscorlib]System.String::Split(char[])
L_001e: stloc.1
L_001f: ldc.i4.0
L_0020: stloc.2
L_0021: br.s L_0035
L_0023: ldloc.1
L_0024: ldloc.2
L_0025: ldelem.ref
L_0026: stloc.3
L_0027: nop
L_0028: ldloc.0
L_0029: ldloc.3
L_002a: callvirt instance bool [System]System.Collections.Generic.SortedSet`1<string>::Add(!0)
L_002f: pop
L_0030: nop
L_0031: ldloc.2
L_0032: ldc.i4.1
L_0033: add
L_0034: stloc.2
L_0035: ldloc.2
L_0036: ldloc.1
L_0037: ldlen
L_0038: conv.i4
L_0039: blt.s L_0023
L_003b: call string [mscorlib]System.Environment::get_NewLine()
L_0040: ldloc.0
L_0041: call string [mscorlib]System.String::Join(string, class [mscorlib]System.Collections.Generic.IEnumerable`1<string>)
L_0046: stloc.s str
L_0048: br.s L_004a
L_004a: ldloc.s str
L_004c: ret
|
|
|
|
|
Okay, I pondered and pondered and realised that you could refactor to this:
void RemoveDuplicatesButton_Click(object sender, RoutedEventArgs e) => this.InputTextBox.Text = string.Join(Environment.NewLine, new SortedSet<string>(this.InputTextBox.Text.Split('\r', '\n')));
It's as easy as that.
|
|
|
|
|
Does the fact that you pondered so long not suggest you might be over-engineering it? There will probably be an equal amount of pondering to working out how it works.
Each to their own, but I much prefer a few simple lines to one clever one.
I suspect you've lost the preservation of order of the original implementation, and why are people using a SortedSet rather than a HashSet?
Regards,
Rob Philpott.
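To make the ordering point concrete, here is a small comparison. The class and method names are made up for this illustration, and a comma-separated string stands in for the text box contents. Building a SortedSet from the items and joining it returns them sorted, whereas using a set only as a "seen" filter keeps the first occurrence of each item in its original position.

```csharp
using System;
using System.Collections.Generic;

public static class SetOrderDemo
{
    // Joins the contents of a SortedSet: duplicates go, but so does the input order.
    public static string ViaSortedSet(string text) =>
        string.Join(",", new SortedSet<string>(text.Split(',')));

    // Uses a HashSet purely to detect repeats; output order follows the input.
    public static string ViaSeenFilter(string text)
    {
        var seen = new HashSet<string>();
        var kept = new List<string>();
        foreach (string item in text.Split(','))
            if (seen.Add(item))
                kept.Add(item);
        return string.Join(",", kept);
    }
}
```

For the input "pear,apple,pear,fig", ViaSortedSet gives "apple,fig,pear" and ViaSeenFilter gives "pear,apple,fig". When the set is only a membership test, a HashSet is enough; a SortedSet is worth it only if the sorted output is actually wanted.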
|
|
|
|
|
You do realise that this was in response to the original poster don't you? I'm not a big fan of "clever code" so I wouldn't tend to write my code like this, this was just a way to show how this could have been done with the basics that were already present without relying on the extra scaffolding the OP put in. As for why a SortedSet, that's what the OP used so I have followed that; presumably he needs the output to be sorted, hence the SortedSet.
|
|
|
|
|
I must be stupid, since I wind up having to "decompose" this stuff when the "intermediate" results and "lazy execution" start yielding other than the results I expect (even if it was due to my own "mind fog").
Then what? Put it "back together" again for the next oaf?
Perhaps "any idiot" can figure this one out, but at what point is it "too much"? And who says so?
This is the opposite extreme of posters who have been chastised for failing to cater to the lowest common denominator (when "they" used LINQ instead of something less "obtuse").
"(I) am amazed to see myself here rather than there ... now rather than then".
― Blaise Pascal
|
|
|
|
|
I prefer to write code that is maintainable rather than elegant.
|
|
|
|
|
That's multiple statements in a single line.
Are you paying for each line-feed character you use?
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
|
|
|
|
imho, you are doing a lot of extra work:
textBox1.Lines = textBox1.Lines.Distinct().OrderBy(str => str).ToArray();
However, if you anticipate that in the future your code will be read by people who are Linq-illiterate, then: whatever.
|
|
|
|
|
Hi Bill, but ah.. You're ordering the strings alphabetically. Original implementation preserved the order just removing duplicates.
Nice implementation though (TextBox.Lines not requiring line delimiting). If you drop the OrderBy(), I think there are no guarantees about preserving order.
I have an inkling there's a select method which has a second 'index' parameter which you could order by.
Regards,
Rob Philpott.
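The overload Rob is thinking of does exist: Enumerable.Select has a form whose selector receives the element and its zero-based index. One possible way to use it, sketched here under the assumption that you do not want to rely on Distinct keeping the input order, is to tag each line with its position, group by the line text, and sort the groups by the smallest index. The class and method names are invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexedDistinct
{
    // Removes duplicates and explicitly restores first-occurrence order.
    public static IEnumerable<string> DistinctInOriginalOrder(IEnumerable<string> lines) =>
        lines.Select((line, index) => new { Line = line, Index = index }) // tag with position
             .GroupBy(x => x.Line, x => x.Index)                           // one group per distinct line
             .Select(g => new { Line = g.Key, FirstIndex = g.Min() })      // earliest position of each
             .OrderBy(x => x.FirstIndex)                                   // sort by that position
             .Select(x => x.Line);
}
```

This is more work than a bare Distinct(), so it is probably only worth it if the ordering really has to be spelled out in the query.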
|
|
|
|
|
Hi, Rob, I assumed, based on the OP's use of a SortedSet that sorting was required. If a more fancy sort is required, then, of course, you could write a custom sort function.
My observation of the behavior of 'Distinct is that it eliminates duplicates whose ordinal position is greater in the structure, but there could well be dimensions of its behavior I am unaware of for other Types/Collections.
cheers, Bill
|
|
|
|
|
Rob Philpott wrote: If you drop the OrderBy(), I think there are no guarantees about preserving order.
The documentation[^] doesn't seem to explicitly mention it, other than saying it "returns an unordered sequence".
But looking at the source code[^], the sequence returned from Distinct will be in the same order as the input sequence:
static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer) {
Set<TSource> set = new Set<TSource>(comparer);
foreach (TSource element in source)
if (set.Add(element)) yield return element;
}
|
|
|
|
|
I am relieved to know (for once) the source matches my observation of a very limited sample-set
cheers, Bill
|
|
|
|
|
Yeah, it's tricky isn't it. Documentation doesn't seem to state that order is guaranteed not to change, but implementation indicates that's the case. I can't really imagine how you can improve much on that implementation either (which is strikingly similar to the initial posted implementation), so it's probably fair to assume no reordering will occur.
But without that cast-iron guarantee, a future version of .NET could scupper things. Well hey, that's what consultancy rates are for.
The bit which has me intrigued now is the Set<T> class. Didn't know there was such a thing. HashSet would be my go-to choice, so I presume it's an internal-to-framework class.
Regards,
Rob Philpott.
|
|
|
|
|
Rob Philpott wrote: The bit which has me intrigued now is the Set<T> class. Didn't know there was such a thing. HashSet would be my go-to choice, so I presume it's an internal-to-framework class.
Yes, it's an internal class within the Enumerable class: Enumerable.Set<TElement>[^]
I'd have used a HashSet<T> as well, but I guess MS were probably writing the Distinct method before the HashSet class was finished.
Either that, or there were some very specific performance issues they were trying to work around. But if that was the case, I'd have expected to see a comment explaining the problem they were trying to solve.
|
|
|
|
|
Interesting!!
I have never seen a method, like your RemoveDuplicatesButton_Click() use the => syntax at this point?! That actually works?
Ben Scharbach
Temporalwars.Com
YouTube:Ben Scharbach
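For anyone else surprised by the syntax: it does work. Expression-bodied methods were added in C# 6, and they let a method whose body is a single expression be written with => instead of braces. The class below is just an illustrative example, not code from the thread.

```csharp
using System;

public static class ExpressionBodiedDemo
{
    // Classic block body.
    public static int SquareBlock(int x)
    {
        return x * x;
    }

    // Equivalent expression body: the value of the expression is the return value.
    public static int Square(int x) => x * x;

    // Also allowed for void methods, as in the Click handler above:
    // the expression is evaluated purely for its side effect.
    public static void Log(string message) => Console.WriteLine(message);
}
```

In the Click handler the expression is an assignment, which is a valid statement expression, so it is allowed as the body of a void method.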
|
|
|
|
|
I hope this is OK to ask here.
I want to learn how a month calendar is written.
I searched the whole web and couldn't find source code for a month calendar in C#.
I would appreciate any help.
|
|
|
|
|