AeroClassics wrote: All thoughts and suggestions are welcomed!
Attempting to generalize from one case based on hypotheticals is seldom a good idea. The outcome is often code that is never used, overly complex, and more fragile (due to that complexity), and it can even cost more when a real case arrives that is totally incompatible with it.
JSchell,
I see your point. Solving a specific type of problem this way can lead to overly complex code on some occasions. However, this really is a specific problem that I tried to explain in as few words as possible to avoid confusion.
I have solved this problem a couple of different ways and am faced with it again, and I was not completely happy with my other solutions. Having spent the majority of my career in the Unix/Linux world, where these things are done differently, I find that even after writing a lot of desktop code in the MS environment I am still applying a Unix mindset to Windows desktop problems. I am not sure this is the best way!
So the problem remains. Regardless of where the data stream originates (pipe, TCP/IP, etc.), waiting for a connection and spinning off a thread to handle that communication channel is the easy part. What I find more difficult is coming up with a decent generic solution for handing off the data. I am beginning to lean toward using an abstract class or an interface to put the burden on the user of the server object. This is just passing the buck, so to speak; unfortunately, I also have to use this object!
This led to the original question: what do most folks do when they need to move data from one object to another? I realize that you can just invoke a method on another object if you have a reference, but that, in my personal opinion, is wrong. The server object should not know about the consumer of the data; it should just make the data available. Typically I toss the data into a queue and invoke an exposed event. But perhaps there is a better way?
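For what it's worth, the queue-plus-exposed-event pattern I describe might be sketched like this (Java, since that's one of the languages in play elsewhere in this thread; all class and method names here are made up for illustration):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// The server side knows nothing about its consumers: it queues incoming
// data and fires an event; interested parties subscribe themselves.
class DataServer {
    interface DataAvailableListener { void onDataAvailable(); }

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final List<DataAvailableListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(DataAvailableListener l) { listeners.add(l); }

    // Called by whichever comm thread received the data (pipe, TCP, ...).
    void publish(byte[] data) {
        queue.add(data);
        for (DataAvailableListener l : listeners) l.onDataAvailable();
    }

    // Consumers drain the queue at their own pace.
    byte[] take() throws InterruptedException { return queue.take(); }
}
```

A consumer registers a listener and drains the queue on its own thread; the server never holds a reference to any particular consumer.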
Doug
I am a Traveler
of both Time and Space
AeroClassics wrote: I realize that you can just invoke a method in another object if you have a
reference but that, in my personal opinion, is wrong.
As a general statement, your conclusion is simply wrong. It condemns OO entirely, and it ignores the history of RPC and the problems that arose when people decided that fine-grained objects were the only way to go. (Early adopters of Java RMI experienced the same problem, as did earlier JEE containers.)
AeroClassics wrote: But perhaps there is a better way?
If I have a specific architecture or business need for a message-queuing system then I use one. I don't use one on a whim, however, because of the complexity involved.
AeroClassics wrote: That's fine if all the clients' data ends up in one place for processing. But what if an object needs to be paired with a comm handler object? Create a dictionary, pairing the socket with the object that's about to handle the data.
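A minimal sketch of that dictionary pairing, assuming Java and an invented registry class (the handler type here is just a placeholder):

```java
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Pairs each accepted socket with the object that will handle its data.
class HandlerRegistry {
    private final Map<Socket, Runnable> handlers = new ConcurrentHashMap<>();

    // Called when a connection is accepted and its handler is created.
    void register(Socket socket, Runnable handler) { handlers.put(socket, handler); }

    // Called when data (or a hangup) arrives on a socket.
    Runnable handlerFor(Socket socket) { return handlers.get(socket); }

    void unregister(Socket socket) { handlers.remove(socket); }
}
```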
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
Perhaps we are on the same quest.
I just posted this query. Please click here.
Although I may not have as much UNIX background as you do, I approach problems in the same compartmentalized manner, and what I am essentially attempting to do is implement a UNIX tee.
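For context, a UNIX tee just duplicates one input stream to several outputs; a minimal sketch of the idea (Java, names illustrative):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Copies everything read from 'in' to every supplied output stream,
// like the UNIX tee(1) utility.
class Tee {
    static void pump(InputStream in, OutputStream... outs) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            for (OutputStream out : outs) {
                out.write(buf, 0, n);
            }
        }
        for (OutputStream out : outs) out.flush();
    }
}
```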
Storing huge files without database
Hi,
I'm at my wits' end. I just need some ideas/brainstorming here with the experts/professionals.
Scenario:
I have 30 text files (each file about 300 MB to 500 MB).
What I need to do is convert these files into some sort of binary and store them somewhere,
but not in an SQL database.
I intend to store these files in 'look-alike' containers.
For example:
I have container A and container B.
Each container has a size cap of 1 GB.
Each text file is moved into the current container (until the quota is reached).
Once it is reached, files move on to container B, and so on to C, D, E...
On top of that, I will have an application to locate these files again.
Is there any medium/container that I can use for this purpose?
Thanks
modified 7-Aug-13 4:04am.
It wouldn't be hard for you to write one. I can't think of anything that fulfills this particular feature set out of the box, but what you have asked for isn't that complicated. Effectively, you'd just create a set of arrays and fill them. Obviously, you couldn't hold all these arrays in memory at once, but it's easy enough to fill one and discard it before moving on to the next.
A couple of thoughts: because we don't know what platform you are going to run this on, we can't get much more specific. If, however, you are going to run it on Vista or a later operating system, take a look at the Kernel Transaction Manager, as it will help you protect the integrity of the files as you write them out: you can use transactions to cover your file writes.
Oh, and whatever you do, make sure that the structures you save the files to get backed up regularly.
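As a rough sketch of the fill-one-container-then-roll-to-the-next idea from the question (Java; the cap, the file naming, and the class name are all assumptions):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Appends files into container_A, container_B, ... rolling to a new
// container whenever the current one would exceed the size cap.
class RollingContainerWriter {
    private final long capBytes;
    private long written;
    private char suffix = 'A';
    private OutputStream current;

    RollingContainerWriter(long capBytes) { this.capBytes = capBytes; }

    void append(byte[] fileContents) throws IOException {
        // Roll to the next container if this file would blow the cap.
        if (current == null || written + fileContents.length > capBytes) {
            roll();
        }
        current.write(fileContents);
        written += fileContents.length;
    }

    private void roll() throws IOException {
        if (current != null) current.close();
        current = new FileOutputStream("container_" + suffix++);
        written = 0;
    }

    void close() throws IOException { if (current != null) current.close(); }
}
```

Locating a file again later would need an index on the side (file name to container, offset, length), which is the other half of the original requirement.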
Hi, thanks Chill for the reply.
It's undecided yet: either .NET or Java, depending on the complexity and ease of the job.
I will pick up some info on the Kernel Transaction Manager.
However, does KTM work fine with >600 GB to terabyte-scale files?
Are there any issues or constraints it might have?
Are there other recommendations?
I'm afraid my management may decide to host the application on UNIX or another platform than Windows.
Then I would be stuck revamping the core program.
modified 7-Aug-13 4:34am.
Mercurius84 wrote: I have 30 text files. (each files about 300mb-500mb) What i need to do is to
convert these files into some sort of binary and store it some where.
Err....
File system already stores binary files.
File system already has a hierarchy.
File system is not a database.
Any solution, including a database, uses the file system for storage.
So exactly what is the problem?
Hi,
I just want to compress the files and package them into one container of a configurable size.
Any idea how to do the packaging? (Not zipping.)
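One way to "compress and package" without using the zip format is a simple length-prefixed container of deflated entries. A minimal sketch (Java; the container layout is invented, not a standard format):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Writes each file as [compressed-length][compressed-bytes] into one stream,
// so entries can be read back sequentially later.
class SimpleContainer {
    static void addEntry(DataOutputStream out, byte[] fileContents) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DeflaterOutputStream def =
                new DeflaterOutputStream(buf, new Deflater(Deflater.BEST_COMPRESSION));
        def.write(fileContents);
        def.finish();
        byte[] compressed = buf.toByteArray();
        out.writeInt(compressed.length);   // length prefix for the reader
        out.write(compressed);
    }
}
```

A real version would also record each entry's name and original size so the locating application can find files again.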
Mercurius84 wrote: I just want to compress the files and package them into one container for a configurable size.
Mercurius84 wrote: Any idea of doing the packaging?(not zipping) How is "compressing and packaging" not zipping?
Use the best guess
I have found the solution in this product:
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Thanks
Mercurius84 wrote: Hadoop Distributed File System (HDFS™): A distributed file system that provides
high-throughput access to application data.
Based only on what you described as your needs, this is overkill.
Hi,
What do you mean by overkill?
Does this HDFS have limitations, or fail to locate the processing logic near the data?
I have no personal experience with this product.
As summarized:
Hadoop is an Apache Software Foundation distributed file system and data management project with goals for storing and managing large amounts of data. Hadoop uses a storage system called HDFS to connect commodity personal computers, known as nodes, contained within clusters over which data blocks are distributed. You can access and store the data blocks as one seamless file system using the MapReduce processing model.
HDFS shares many common features with other distributed file systems while supporting some important differences. One significant difference is HDFS's write-once-read-many model that relaxes concurrency control requirements, simplifies data coherency, and enables high-throughput access.
In order to provide an optimized data-access model, HDFS is designed to locate processing logic near the data rather than locating data near the application space.
It sounds promising.
Mercurius84 wrote: What do you mean by overkill...Hadoop is an Apache Software
Foundation distributed file system and data management project with goals for
storing and managing large amounts of data.
Your stated requirements do not meet the definition of "large amounts of data".
Let me give you some examples of large data:
- 2000 transactions a second sustained, with an expected lifetime of 7 years and a real-time need of 6 to 18 months of immediate availability; each transaction is 1 KB in size.
- Each originator will produce several 100 MB downloads several times a month; sizing must allow for up to 10,000 originators with a lifetime of 5 years.
Hi Merc,
It just hit me when reading this post that your problem is similar to the problem of splitting a large file into chunks, for example when you have a big archive on disk and want to store it on removable media such as floppies, CD-ROMs or DVDs.
In the old days we used ARJ to split a compressed file into a number of volumes (.arj, .a01, .a02, etc.). Each volume had a fixed maximum size, e.g. 1.44 MB for a floppy.
A limitation was that you could not add/remove stuff from .a02 without breaking the big file, so if you need to do that, splitting big files into compressed volumes might not be your solution.
If you want to play with this approach you can use RAR. According to http://acritum.com/software/manuals/winrar/html/helparcvolumes.htm[^] this feature is called multivolume archives.
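Stripped of the compression and archive headers, the multivolume idea boils down to splitting one stream into fixed-size numbered files. A rough sketch (Java; the naming scheme and class name are invented):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Splits one input stream into numbered volumes of at most 'volumeSize'
// bytes, like ARJ/RAR multivolume archives minus the compression.
class VolumeSplitter {
    static int split(InputStream in, String baseName, int volumeSize) throws IOException {
        byte[] buf = new byte[volumeSize];
        int volume = 0;
        int n;
        while ((n = readUpTo(in, buf)) > 0) {
            try (FileOutputStream out =
                    new FileOutputStream(baseName + "." + String.format("%03d", volume))) {
                out.write(buf, 0, n);
            }
            volume++;
        }
        return volume;  // number of volumes written
    }

    // Reads until the buffer is full or EOF; returns the byte count.
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) break;
            total += n;
        }
        return total;
    }
}
```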
Kind Regards,
Keld Ølykke
Many thanks.
But this process only works on documents that are final.
I do not think it would be possible to append more files after the files have been compressed and segmented.
I have a requirement that additional files can be added after the compression and segmentation above.
Normally, I don't think further modification of a multi-volume archive is possible, but I don't have the insight to say for certain. There might be an archive tool out there that can do it.
Kind Regards,
Keld Ølykke
Mercurius84 wrote: I just want to compress the files and package them into one container for a configurable size.
Windows has compressed drives; pretty sure every major OS does as well.
However, your requirements still don't call for that. Again, your requirements don't require a specialized system: current file systems are more than capable of handling that trivial amount of data.
Yes.
However, we need to shrink the file size down further and get better file handling,
instead of leaving it to the OS.
Mercurius84 wrote: However, we need to further shrinking down the file size and better file
handling.
And you base that on what, exactly?
What are your criteria? What is your desired improvement? How did you measure those criteria against the plain file system?
How do you escape the fact that any such solution will STILL rely on the file system?
Mercurius84 wrote: Instead of OS handling.
The OS has been optimized to handle files, given that file systems are a key component of desktop OSes.
I had assumed that programming could do 'almost' wonderful things.
For example:
I have a file sized 10 MB.
This special program shrinks it down by ~30% (to 7 MB, for example)
and then stores it in a container.
What I mean by OS handling is something other than a normal compression-and-store-into-a-folder.
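Whether a program can shrink a given file by ~30% depends entirely on the data; plain deflate is an easy way to find out. A small sketch (Java; the class name is invented):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Compresses a byte array. How much it shrinks depends on the data:
// repetitive text compresses well, already-compressed data barely at all.
class Shrink {
    static byte[] deflate(byte[] input) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream out =
                new DeflaterOutputStream(buf, new Deflater(Deflater.BEST_COMPRESSION))) {
            out.write(input);
        }  // closing the stream finishes the deflate block
        return buf.toByteArray();
    }
}
```

Comparing `deflate(contents).length` to the original length on a sample of the real files would tell you whether the ~30% target is realistic.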
Mercurius84 wrote: This special program shrinks down the size by ~30% (7MB for example) and
then store it in to a container.
Fine, but why do you need to do that? What is the business or technical need that requires this?
Mercurius84 wrote: What i mean by OS handling is unlike a normal compression and store them into a
folder.
Not sure what you mean by that; as I already said, desktop OSes already support compression, so that point by itself is moot.