|
Abhipal Singh wrote: 404 Error - answer not found
ROTGDMFFLMGDMFAO
"... having only that moment finished a vigorous game of Wiff-Waff and eaten a tartiflet." - Henry Minute
"Let's face it, after Monday and Tuesday, even the calendar says WTF!" - gavindon
Programming is a race between programmers trying to build bigger and better idiot proof programs, and the universe trying to build bigger and better idiots, so far... the universe is winning. - gavindon
|
Who is using LightSwitch[^] to write a data(base?) editor?
What's your experience / opinion of it?
|
Hi,
I am trying to get better at coding and to follow various best practices, among them the DRY principle. As part of that, I want to develop a LogHelper library for use in all of my projects. I want to design it as well as possible, and I am not sure whether the approach I intend to take makes sense.
I develop a lot of small applications for engineers in my company - mostly for processing files, QA checks and the like. They are similar in many respects, and one thing they all share is the creation of logs for the engineers to read and analyze.
Overall, for now I intend to create three types of logs:
- Text file logs for users to read
- Csv file logs for users to read
- Error logs for me to troubleshoot
So, in each application I might need three separate logger class instances.
I was wondering whether the Factory design pattern is the way to go for me, i.e. I create a LogHelperFactory library which contains creation methods for TxtLogger, CsvLogger and DebugLogger.
(I had a look here: http://www.codeproject.com/Articles/14030/Factory-Design-pattern)
Then in my applications I could call something like:
private void DoNewWork()
{
    var textLog = LogHelperFactory.CreateLog(Logs.Txt, textLogFilePath);
    var debugLog = LogHelperFactory.CreateLog(Logs.Debug, debugLogFilePath);
    textLog.Write("Starting work...");
    debugLog.Write("Something else");
}
My main idea is for this to be easily scalable and extensible (e.g. if I want to add a cloud database log in the future), and I want it to be safe in terms of threads, file locks etc.
For now it is all very simple, i.e. a new log is generated when the user clicks a 'start' button.
I read that a singleton pattern is a good choice for logging, but because you cannot specify things such as the log name and location in the private constructor, it is not the best choice for the 'generic reusable piece of code' that I could reference in various projects.
Thanks in advance for any advice
Regards
Bartek
|
Bartosz Jarmuż wrote: I was wondering whether the Factory design pattern is the way to go for me. Yup.
Have an ILogger interface that all three classes adhere to, and have the factory return one of the three classes, cast to ILogger.
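For illustration, a minimal sketch of what that could look like (ILogger, TxtLogger and the factory shape here are placeholders, not a prescribed API):
public interface ILogger
{
    void Write(string message);
}

public class TxtLogger : ILogger
{
    private readonly string _path;
    public TxtLogger(string path) { _path = path; }

    public void Write(string message)
    {
        // Append a line; CsvLogger and DebugLogger would implement ILogger similarly.
        System.IO.File.AppendAllText(_path, message + System.Environment.NewLine);
    }
}

public enum Logs { Txt, Csv, Debug }

public static class LogHelperFactory
{
    public static ILogger CreateLog(Logs type, string path)
    {
        switch (type)
        {
            case Logs.Txt: return new TxtLogger(path);
            // case Logs.Csv: return new CsvLogger(path);
            // case Logs.Debug: return new DebugLogger(path);
            default: throw new System.ArgumentOutOfRangeException("type");
        }
    }
}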
Bartosz Jarmuż wrote: I read that a singleton pattern is a good choice for logging Only as an interface to access the ILogger.
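If you do want a single, global access point, it could be as simple as a static wrapper around the interface (a sketch, nothing prescribed):
public static class Log
{
    // One shared ILogger, set once at startup; the rest of the app
    // only ever sees the interface.
    public static ILogger Current { get; set; }
}
// At startup:    Log.Current = LogHelperFactory.CreateLog(Logs.Txt, textLogFilePath);
// Anywhere else: Log.Current.Write("Starting work...");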
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
Thanks for your reply!
So... if I create an ILogger interface, then in my applications I would use it as follows:
ILogger TxtLog {get;set;}
ILogger DebugLog {get;set;}
ILogger CsvLog {get;set;}
TxtLog = LogFactory.CreateLogger("txt");
DebugLog = LogFactory.CreateLogger("debug");
CsvLog = LogFactory.CreateLogger("csv");
But that would require my separate log types to have the same methods - and I guess I want different methods for the different types of logs: the csv log will be used to produce a sort of report for the users, the debug log will be customized to log errors with severity etc., whereas the txt logs will be for users to analyze what the program was doing.
So don't I rather want to specify the log types explicitly?
Also, this leads to 'why use a factory and not just three separate classes?' Does the factory pattern force it to be safer in any way?
|
Bartosz Jarmuż wrote: the csv log will be used to produce a sort of report for the users, the debug log will be customized to log errors with severity etc., whereas the txt logs will be for users to analyze what the program was doing. There are many ways you can solve that and still keep your interface clean. What you are describing are settings; they could have different types, but the interface would stay the same. You might implement this using settings strategies, for example, or you could have a base ISettings interface which the ILog exposes, with each type of setting implementing it - the relevant settings would then be created as part of your factory. It's really up to you how you want to do this, but coding to interfaces should be your starting point - especially as you can use it to mock out tests very easily.
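For example, a rough sketch of that ISettings idea (all the names here are hypothetical):
public interface ISettings { }

public class CsvLogSettings : ISettings
{
    public string Delimiter { get; set; }
}

public class DebugLogSettings : ISettings
{
    public int MinimumSeverity { get; set; }
}

public interface ILog
{
    // Each log exposes its settings only through the base interface.
    ISettings Settings { get; }
    void Write(string message);
}
The factory would then create the matching ISettings implementation alongside each ILog implementation.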
|
OK - I will try, thanks!
Although, I admit, I cannot imagine how that is supposed to work (creating and using an ILoggerSettings interface...).
|
I was expecting you'd want to log the same information to several locations; the factory and the interface make it rather easy to swap out the object for another, as long as the class it is based on implements the promised interface.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
No, not necessarily - in most cases I will be logging different things to different locations. So the thing I am most concerned about is having a single LogHelper that I can use in all my apps, to avoid writing the same code over and over.
And also, I want this one helper to be extensible, efficient and safe - e.g. when there will be multiple threads...
|
Bartosz Jarmuż wrote: And also, I want this one helper to be extensible, efficient and safe Then don't forget to log the code's actions and exceptions. What logger will you use for your logger?
Bartosz Jarmuż wrote: e.g. when there will be multiple threads... Like this[^]?
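Something like this, perhaps - a minimal sketch of one way to make Write thread-safe (one lock per logger instance, reusing the ILogger interface from the earlier sketch; only one of several possible approaches):
public class TxtLogger : ILogger
{
    private readonly string _path;
    private readonly object _sync = new object();

    public TxtLogger(string path) { _path = path; }

    public void Write(string message)
    {
        // Serialize writers so concurrent threads can't interleave lines
        // or collide on the file handle.
        lock (_sync)
        {
            System.IO.File.AppendAllText(_path, message + System.Environment.NewLine);
        }
    }
}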
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
Read my article[^]
You don't have to use that approach, but there might be some tricks and tips in there that could be useful to you.
Hope that helps.
|
public void MethodName(ObservableCollection<DataCollection> dataCollection)
{
    if (dataCollection != null)
    {
        // Option 1: short-circuits on the first match.
        IsChecked = dataCollection.Any(o => o.DataCollectionID.Equals(30));
        // Option 2: filters the whole collection, then counts.
        IsChecked = dataCollection.Where(o => o.DataCollectionID.Equals(30)).Count() > 0;
    }
}
Can anyone explain which of the above two filters would be the most efficient to use - .Any() or .Where(...).Count()?
Note: consider that dataCollection has over 10,000 items.
Please advise. Thank you
|
The Any clause should be faster because it short-circuits: it stops enumerating as soon as it hits the first element satisfying the condition, i.e. your collection ID equaling 30. The Where clause returns a lazy iterator (think yield return here), but the Count() that follows must drive that iterator to completion - the check for a count greater than 0 only happens after the Where clause has been enumerated in full.
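Conceptually, Any() with a predicate boils down to something like this:
// Simplified version of what Enumerable.Any(predicate) does internally.
public static bool Any<T>(System.Collections.Generic.IEnumerable<T> source, System.Func<T, bool> predicate)
{
    foreach (T item in source)
    {
        if (predicate(item))
            return true;   // bail out at the first match
    }
    return false;
}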
|
And to add to what Pete says: if you need to know which is the most efficient, it's simple enough to time it. Use the Stopwatch class[^] and run it a good number of times to eliminate external variance.
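For instance, a rough sketch of such a timing run (the iteration count is arbitrary, and dataCollection is assumed to be the collection from the question):
// Time both queries over many iterations to smooth out the noise.
var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
    bool found = dataCollection.Any(o => o.DataCollectionID.Equals(30));
}
sw.Stop();
Console.WriteLine("Any:         {0}ms", sw.ElapsedMilliseconds);

sw.Restart();
for (int i = 0; i < 1000; i++)
{
    bool found = dataCollection.Where(o => o.DataCollectionID.Equals(30)).Count() > 0;
}
sw.Stop();
Console.WriteLine("Where+Count: {0}ms", sw.ElapsedMilliseconds);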
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
|
Hi, they are both right. Just to be more precise: Where() itself is lazy and doesn't enumerate anything on its own, but following it with Count() forces the whole filtered sequence to be enumerated. In general the advice is to stick with Any().
|
A very good afternoon/time of day to all.
So, I want to read very large files sequentially. By very large, I'm thinking of 1GB up to maybe several hundred GB. First question: how sensible/practical/possible is it to have a file of, say, 250GB?
Probably not very sensible at all, but I'm observing strange effects even at a couple of GB. 1GB files behave the way I'd expect all the time, but 3GB files often do not.
To demonstrate what I mean, I create 8 files, the first 1GB in size, each subsequent one 1GB bigger. File 1 = 1GB, file 8 = 8GB:
private static void Main(string[] args)
{
    byte[] buffer = new byte[1 << 20];   // 1MB buffer
    Random r = new Random();
    for (int mb = 1024; mb <= 8192; mb += 1024)
    {
        using (FileStream fs = File.Create(string.Format(@"c:\temp\GB{0}.dat", mb >> 10)))
        {
            // Write 'mb' random megabytes to the file.
            for (long index = 0; index < mb; index++)
            {
                r.NextBytes(buffer);
                fs.Write(buffer, 0, buffer.Length);
            }
        }
    }
}
14 minutes later I have my test files, all full of lots of randomness. And all I'm going to do is read each file in its entirety using FileStream.Read:
private static void Main(string[] args)
{
    byte[] buffer = new byte[1 << 20];   // 1MB buffer
    for (int mb = 1024; mb <= 8192; mb += 1024)
    {
        Console.Write("GB{0}.dat: ", mb >> 10);
        Stopwatch sw = Stopwatch.StartNew();
        using (FileStream fs = File.Open(string.Format(@"c:\temp\GB{0}.dat", mb >> 10), FileMode.Open))
        {
            // Read the whole file, 1MB at a time.
            // (Read can return fewer bytes than requested; good enough for a benchmark.)
            for (long index = 0; index < mb; index++)
            {
                fs.Read(buffer, 0, buffer.Length);
            }
        }
        Console.WriteLine("{0:0.00}s, {1:0.00}MB/s", sw.ElapsedMilliseconds / 1000d, mb * 1000d / sw.ElapsedMilliseconds);
    }
}
Now, regardless of the file size I'd expect the sequential read speed to be similar; here's what I get:
GB1.dat: 17.54s, 58.38MB/s
GB2.dat: 39.20s, 52.25MB/s
GB3.dat: 149.56s, 20.54MB/s
GB4.dat: 92.97s, 44.06MB/s
GB5.dat: 175.25s, 29.22MB/s
GB6.dat: 84.29s, 72.90MB/s
GB7.dat: 104.96s, 68.29MB/s
GB8.dat: 179.43s, 45.66MB/s
First thought would be: well, things slow down when other processes are accessing the disk, so the inconsistency could just be other processes doing stuff - but the results seem a bit too consistent for that. At about 3GB, the speed always drops off sharply.
I don't know what's going on. Windows is clearly caching, the disk probably is too, and maybe there's some interaction (garbage collection/virtual memory) which arises at certain sizes.
My second question: do you think the size of the file affects the speed at which you can read it sequentially? Actually, I'd like random access, but let's do the simple things first.
Regards,
Rob Philpott.
|
How would "random access" work on a file that large? Imagine someone editing and locking the file.
You want a database.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
Well, the file is immutable, so no locking should be required. I don't know the underlying file system structures, but I would think that random access would probably degrade with size.
Regards,
Rob Philpott.
|
It won't be something that fits in memory, meaning that it will probably be seeking a lot, starting from the start of the file.
Any special format? CSV, XML?
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
Variable-length byte arrays, somewhere between 50 bytes and 5KB, numbered 0, 1, 2, 3 etc., up to 500 million of them. No more than 1TB in total.
The requirement is quite simple really: retrieve a byte array by index as fast as possible (random access), and provide sequential access as fast as possible (all of them, one by one - obviously this will take a while).
My current thinking is a format of a 4-byte length + byte array, repeated, with a separate index file to provide the necessary indirection to cope with the variable-length nature.
If there were 100 of them you'd stick them in an array, so this really is just a problem of scale. Quite an interesting one.
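In rough code, the lookup I have in mind would look something like this (just a sketch, assuming the index file holds one 8-byte offset per record; Read can return fewer bytes than requested, so a robust version would loop):
// Fetch record i: index file -> data offset -> 4-byte length -> payload.
public static byte[] ReadRecord(FileStream index, FileStream data, long i)
{
    byte[] offsetBytes = new byte[8];
    index.Seek(i * 8, SeekOrigin.Begin);
    index.Read(offsetBytes, 0, 8);
    long offset = BitConverter.ToInt64(offsetBytes, 0);

    data.Seek(offset, SeekOrigin.Begin);
    byte[] lengthBytes = new byte[4];
    data.Read(lengthBytes, 0, 4);
    int length = BitConverter.ToInt32(lengthBytes, 0);

    byte[] record = new byte[length];
    data.Read(record, 0, length);
    return record;
}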
Regards,
Rob Philpott.
|
Rob Philpott wrote: a separate index file to provide the necessary indirection to cope with the variable-length nature. I'd still recommend a database.
Your large file will be fragmented and hard to handle. Since the file doesn't fit in the hard disk's cache, you'd be reading a lot, mostly just to change position.
500 million records are somewhat easier to manage. It would also make it easier to locate a particular instance.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
Thanks for your thoughts Eddy.
Hmm, I don't know. The data is immutable, non-relational, not ACID; no fill factors or B-trees or other DB weirdness required. Mad dog that I am, I shall keep trying - I'm confident I can do it faster than SQL Server.
Or at least - try.
Regards,
Rob Philpott.
|
You're welcome
|
I would say that this is just a bad way to code. Why do you need a file that spans 3GB in the first place, and then say you want to go to file sizes of 32GB+ or 100GB? Why, why do you want to do that to .NET framework?
I would recommend that you split the file into small chunks of bytes; 1GB can be split into 10 chunks of roughly 100MB each. That would allow you to work with all of them pretty quickly.
The problem is that the .NET framework supports objects of up to 2GB; files of that size need to come into memory, and .NET wastes most of its cycles just keeping the content of your files in RAM. That is why the process takes longer as the file size increases: it has to make sure the resources are kept in memory, and the managed framework has control of the resources too. As suggested, keep the chunk size to 100MB at most and then work on the chunks separately.
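For what it's worth, a rough sketch of that kind of chunking (paths and sizes here are arbitrary):
// Split a large file into ~100MB pieces, copying 1MB at a time.
const int ChunkSize = 100 * 1024 * 1024;
byte[] buffer = new byte[1 << 20];
int chunkIndex = 0;
long written = long.MaxValue;   // forces a new chunk on the first write
FileStream output = null;
using (FileStream input = File.OpenRead(@"c:\temp\big.dat"))
{
    int read;
    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        if (written >= ChunkSize)
        {
            if (output != null) output.Dispose();
            output = File.Create(string.Format(@"c:\temp\chunk{0}.dat", chunkIndex++));
            written = 0;
        }
        output.Write(buffer, 0, read);
        written += read;
    }
}
if (output != null) output.Dispose();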
The sh*t I complain about
It's like there ain't a cloud in the sky and it's raining out - Eminem
~! Firewall !~
|
Afzaal Ahmad Zeeshan wrote: Why, why do you want to do that to .NET framework?
Well, because I have a lot of data. See further down the thread for explanations, but I could be working with up to 1TB of data - splitting that into 100MB files isn't going to be ideal. NTFS allegedly supports files into the exabytes, so why not?
I'm aware of the unsatisfactory 2GB limit on arrays and such, even on x64, but I'm using streams here and at no point am I bringing the whole file into memory; in fact, I'm only ever looking at a 1MB sliver of it in the example code.
Unusual, yes; unrealistic, well, I'd argue not.
Regards,
Rob Philpott.
|