LAVA Forums

> What can kill a queue?
jlokanis
post Aug 29 2008, 08:04 PM
Post #1


Very Active
***

Member
Posts: 163
Joined: 14-June 05
From: Seattle, WA
Member No.: 2411
Using LabVIEW Since:1993
LV:8.5.1 ,8.20 ,7.1.1


I have run into a very strange problem. I am getting sporadic occurrences of an error with one of my queues. Here is the error:

Error 1122 occurred at Dequeue Element in Process GUI Events.vi:34->Engine 422.vi

Possible reason(s):

LabVIEW: Refnum became invalid while node waited for it.

The weird thing is, as far as I know, this can ONLY happen if the queue is destroyed in some parallel process while this VI is waiting for an element to be enqueued. But I have searched all the VIs, and the only one where the queue is destroyed is the cleanup VI that comes after this VI and is connected to it by the error wire. So there is no way that cleanup VI could execute before the VI that is waiting.

I have a sneaking suspicion that there are some latent bugs in the queue feature. I have a large number of reentrant VIs running, and I create a lot of unnamed queues that I pass inside a cluster to sub VIs. So there are many, many instances of this queue (all unique, supposedly) that exist within each tree of reentrant VIs. I thought LabVIEW used a GUID to name unnamed queues so they could never step on each other, but maybe because I have so many, the 'name' is getting reused?

Any other ideas? I am at a total loss.

thanks,

-John



--------------------
---------
You mean you still use a keyboard to write your code? How quaint...


neB
post Aug 29 2008, 08:32 PM
Post #2


Certified Kool-Aid Kid
*****

Premium Member
Posts: 1156
Joined: 6-December 02
From: Pittsburgh PA USA
Member No.: 29
Using LabVIEW Since:1998
LV:7.1


Hi John,

I'm not sure if the following is what is hitting you, but you did ask for "other ideas".

When a VI is no longer running, all of the resources that were allocated by that VI are destroyed. That includes queues. So if the queue was created in a VI that goes idle, the queues it created are destroyed. The workaround is to make sure the VIs that created the queues don't go idle until after the queues are destroyed.
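
To make the lifetime rule above concrete, here is a minimal Python sketch. It is only an analogy, not LabVIEW's implementation: the `Runtime` class, the `owner_went_idle` method, and the `InvalidRefnum` exception are hypothetical names invented for this example, and it assumes each queue is owned by whichever VI created it.

```python
import queue


class InvalidRefnum(Exception):
    """Stands in for LabVIEW error 1122 in this sketch (hypothetical name)."""


class Runtime:
    """Toy model: every queue is owned by the VI that created it."""

    def __init__(self):
        self._owned = {}   # owner name -> set of refnums it created
        self._queues = {}  # refnum -> queue.Queue
        self._next = 1

    def create_queue(self, owner):
        refnum = self._next
        self._next += 1
        self._queues[refnum] = queue.Queue()
        self._owned.setdefault(owner, set()).add(refnum)
        return refnum

    def enqueue(self, refnum, item):
        try:
            self._queues[refnum].put(item)
        except KeyError:
            raise InvalidRefnum(refnum) from None

    def owner_went_idle(self, owner):
        # When the owner stops running, everything it created is cleaned up.
        for refnum in self._owned.pop(owner, set()):
            del self._queues[refnum]


rt = Runtime()
ref = rt.create_queue(owner="creator.vi")
rt.enqueue(ref, "hello")          # fine while creator.vi is still running
rt.owner_went_idle("creator.vi")  # creator stops, so its queues go away
try:
    rt.enqueue(ref, "again")
    still_valid = True
except InvalidRefnum:
    still_valid = False
print(still_valid)
```

The refnum value itself is unchanged after the owner goes idle; it simply no longer points at anything, which is why the failure shows up in whatever code uses it next.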

Ben


jlokanis
post Aug 29 2008, 08:56 PM
Post #3




Thanks for the reply. That is definitely a way to cause a queue to be deallocated. In my case, however, I don't think that is possible. The structure of my code has a main VI that calls a sub VI to create the queue and then passes the queue ref to another sub VI that listens to the queue. When the listener quits, it passes its error cluster to the sub VI that destroys the queue. Since all of these VIs are part of the main VI, I don't see how it is possible that the queue reference would be automatically removed from memory. The VI that gets the error is running as a sub VI of the same VI that called the VI that created the queue.

The interesting thing is everything seems to work well for a long time and then it all goes to heck. As you can see from the error, the 'main.vi' has been spawned from a template 422 times and the reentrant subVI that got the error is one of 34 in memory right now, all listening to their own 'version' of this queue.

I think the LabVIEW engine gets 'confused' and screws this up. I can see many examples of this happening in various parts of my code where queues either become invalid while waiting or are invalid when passed to a subVI, even though a release was never called and their creator VI is still in memory and supposedly still 'reserved for run'...

Perhaps there is some issue with all these VIs being reentrant? I only use the shared clones mode, but none of them have an uninitialized shift register...


PJM_labview
post Aug 29 2008, 09:56 PM
Post #4


Extremely Active
****

JKI
Posts: 612
Joined: 19-June 03
From: Bay Area, CA (USA)
Member No.: 121
Using LabVIEW Since:1998
LV:8.5.1 ,8.6 ,8.2.1


I will say that Ben is probably right.

This has happened to me countless times, and every single time it was a lifetime issue.

Scenario Example:

Create Queue in a VI
Put Queue refnum in LV2 Gbl
Launch (asynchronously) other code that needs the Queue (other code calls LV2 Gbl)
Create Queue VI stops --> Queue refnum becomes invalid because LabVIEW garbage collects it.
--> Get error in your asynchronous code
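
The scenario above can be imitated in Python with a listener thread blocked on a queue that another thread then destroys. This is a sketch of the general pattern only: `DestroyableQueue` and `QueueDestroyed` are hypothetical names, and `destroy()` stands in for whatever invalidates the refnum (here, the creating VI going idle).

```python
import threading
from collections import deque


class QueueDestroyed(Exception):
    """Stands in for error 1122: refnum became invalid while waiting."""


class DestroyableQueue:
    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()
        self._destroyed = False

    def enqueue(self, item):
        with self._cond:
            if self._destroyed:
                raise QueueDestroyed()
            self._items.append(item)
            self._cond.notify()

    def dequeue(self):
        with self._cond:
            # Block until an element arrives or the queue is destroyed.
            while not self._items and not self._destroyed:
                self._cond.wait()
            if self._destroyed:
                raise QueueDestroyed()
            return self._items.popleft()

    def destroy(self):
        with self._cond:
            self._destroyed = True
            self._items.clear()
            self._cond.notify_all()  # wake every blocked waiter


q = DestroyableQueue()
result = []
started = threading.Event()


def listener():
    started.set()
    try:
        result.append(q.dequeue())
    except QueueDestroyed:
        result.append("error 1122")


t = threading.Thread(target=listener)
t.start()
started.wait()
q.destroy()  # plays the role of the Create Queue VI stopping
t.join()
print(result)
```

The listener does nothing wrong here. It reports the error only because something outside it ended the queue's life while it was waiting, which is the situation the asynchronous code ends up in.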

PJM

jlokanis
post Aug 29 2008, 10:48 PM
Post #5




QUOTE (PJM_labview @ Aug 29 2008, 02:56 PM) *
Create Queue in a VI
Put Queue refnum in LV2 Gbl
Launch (asynchronously) other code that needs the Queue (other code calls LV2 Gbl)
Create Queue VI stops --> Queue refnum becomes invalid because LabVIEW garbage collects it.
--> Get error in your asynchronous code


Yes, if I were creating the queue that way, it would be a problem. But I am not.
This is an extremely simplified version of my code. Each of these functions is actually buried in several sub-VIs. And all of them are reentrant (shared clones). The top VI (this VI) is a template that gets spawned many times. Also, I have over 15 queues, not just the one shown here. There are no functional globals or dynamic calls to anything that creates the queue refs. Everything is tied together by wires, just as you see it here.

[Attached image]
The Dequeue Element function gets an error stating that the reference became invalid while waiting. How could this happen?

jdunham
post Aug 29 2008, 11:07 PM
Post #6


Very Active
***

Member
Posts: 200
Joined: 6-March 05
From: Mountain View, CA
Member No.: 1764
Using LabVIEW Since:1994
LV:8.5


QUOTE (jlokanis @ Aug 29 2008, 01:56 PM) *
Thanks for the reply. That is definitely a way to cause a queue to be deallocated. In my case, however, I don't think that is possible. The structure of my code has a main VI that calls a sub VI to create the queue and then passes the queue ref to another sub VI that listens to the queue. When the listener quits, it passes its error cluster to the sub VI that destroys the queue. Since all of these VIs are part of the main VI, I don't see how it is possible that the queue reference would be automatically removed from memory. The VI that gets the error is running as a sub VI of the same VI that called the VI that created the queue.

I think the LabVIEW engine gets 'confused' and screws this up. I can see many examples of this happening in various parts of my code where queues either become invalid while waiting or are invalid when passed to a subVI, even though a release was never called and their creator VI is still in memory and supposedly still 'reserved for run'...

Perhaps there is some issue with all these VIs being reentrant? I only use the shared clones mode, but none of them have an uninitialized shift register...


John:

We also use plenty of queues and a smattering of reentrant VIs. We get error 1122 all the time, because we kill the queues on purpose to stop our processes, but it never happens unexpectedly.
Are you using the "force destroy?" input of the Release Queue function? You should not need to destroy the queues; just release all the references you obtain. (There are good times to set force destroy to True, but don't just set it because you feel like it.)

If not, then I would try to set a breakpoint immediately after the Enqueue or Dequeue function that is throwing that error and then poke around to see which of your parallel VIs is still running.
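
As a rough illustration of the difference between releasing one reference and force-destroying the queue, here is a toy reference-counting model in Python. It is a simplification under stated assumptions: `QueueRegistry` and its methods are hypothetical, queues are identified by name only, and the model assumes a queue lives until its last reference is released unless a release forces destruction.

```python
class QueueRegistry:
    """Toy model of obtain/release reference counting (hypothetical API)."""

    def __init__(self):
        self._refcount = {}  # queue name -> number of open references

    def obtain(self, name):
        # Each obtain adds one reference to the same underlying queue.
        self._refcount[name] = self._refcount.get(name, 0) + 1
        return name

    def release(self, name, force_destroy=False):
        if name not in self._refcount:
            raise KeyError(f"queue {name!r} is already gone")
        if force_destroy or self._refcount[name] == 1:
            # Last reference, or a forced destroy: the queue goes away
            # for every holder at once.
            del self._refcount[name]
        else:
            self._refcount[name] -= 1

    def is_alive(self, name):
        return name in self._refcount


reg = QueueRegistry()
reg.obtain("results")   # producer's reference
reg.obtain("results")   # consumer's reference

reg.release("results")  # producer is done; consumer still holds one
alive_after_plain_release = reg.is_alive("results")

reg.obtain("results")   # producer comes back
reg.release("results", force_destroy=True)
alive_after_force = reg.is_alive("results")

print(alive_after_plain_release, alive_after_force)
```

With a plain release the other holder keeps a working queue; with a forced destroy it does not, so any waiter on that queue would see the invalid-refnum error.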

Good luck.


Eugen Graf
post Aug 29 2008, 11:16 PM
Post #7


Extremely Active
****

Member
Posts: 361
Joined: 6-February 07
From: Homburg/Germany
Member No.: 7689
Using LabVIEW Since:2004
LV:8.5 ,8.0.1


QUOTE (jdunham @ Aug 30 2008, 01:07 AM) *
We also use plenty of queues and a smattering of reentrant VIs. We get error 1122 all the time, because we kill the queues on purpose to stop our processes, but it never happens unexpectedly.



Violence is a bad thing.

Aristos Queue
post Aug 29 2008, 11:18 PM
Post #8


LV R&D Envoy
*****

NI
Posts: 1226
Joined: 15-August 06
From: Austin, TX
Member No.: 5877
Using LabVIEW Since:2000
LV:8.5.1


We don't use a true GUID. We use a fixed count for the first several bits and a random value for the last few. In order to get any recycling of the unnamed queue IDs you would not only have to generate roughly 30 million queues, you would also need to get particularly (un)lucky on the other bits. That seems unlikely.
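
For readers who want a picture of that kind of ID, here is a hedged Python sketch of a counter-plus-random scheme. The bit widths (`COUNTER_BITS`, `RANDOM_BITS`) and the `RefnumAllocator` class are invented for illustration; the post gives no actual layout, and this sketch does not try to reproduce the 30 million figure.

```python
import random

COUNTER_BITS = 24  # hypothetical width, not taken from LabVIEW
RANDOM_BITS = 8    # hypothetical width, not taken from LabVIEW


class RefnumAllocator:
    """Toy ID generator: high bits count up, low bits are random."""

    def __init__(self, seed=None):
        self._count = 0
        self._rng = random.Random(seed)

    def allocate(self):
        count = self._count
        # The counter wraps only after 2**COUNTER_BITS allocations.
        self._count = (self._count + 1) % (1 << COUNTER_BITS)
        salt = self._rng.getrandbits(RANDOM_BITS)
        return (count << RANDOM_BITS) | salt


alloc = RefnumAllocator(seed=0)
ids = [alloc.allocate() for _ in range(100_000)]
all_unique = len(set(ids)) == len(ids)
allocations_before_wrap = 1 << COUNTER_BITS
print(all_unique, allocations_before_wrap)
```

In a scheme like this, an ID can repeat only after the counter wraps, and even then only if the random bits happen to match, which is the "particularly (un)lucky" part.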

--------------------
"A VI outside a class is a gun without a safety. Data outside a class is a target."
--- A message from LabVOOP R&D


jlokanis
post Aug 29 2008, 11:25 PM
Post #9




QUOTE (jdunham @ Aug 29 2008, 04:07 PM) *
Are you using the "force destroy?" input of the Release Queue function? You should not need to destroy the queues; just release all the references you obtain. (There are good times to set force destroy to True, but don't just set it because you feel like it.)

If not, then I would try to set a breakpoint immediately after the Enqueue or Dequeue function that is throwing that error and then poke around to see which of your parallel VIs is still running.


I am only creating the queue in one place and destroying it in another. I do not 'obtain' an existing queue anywhere because I am using unnamed queues. I just pass the queue reference to the VIs that need it.
I do use force destroy, however. Maybe I should stop doing that, even though in this case it should not matter.

QUOTE (Aristos Queue @ Aug 29 2008, 04:18 PM) *
We don't use a true GUID. We use a fixed count for the first several bits and a random value for the last few. In order to get any recycling of the unnamed queue IDs you would not only have to generate roughly 30 million queues, you would also need to get particularly (un)lucky on the other bits. That seems unlikely.



What about memory corruption? I notice that when this problem occurs, the whole app also starts to slow down AND memory usage starts to increase.

BTW: This problem only happens in the EXE deployed to a target machine and only after running for several days. So, I really have no way to debug it with breakpoints or anything. At least I log the errors to the event handler...

Aristos Queue
post Aug 30 2008, 03:25 AM
Post #10




QUOTE (jlokanis @ Aug 29 2008, 06:25 PM) *
What about memory corruption? I notice that when this problem occurs, the whole app also starts to slow down AND memory usage starts to increase.

That was the other thing I was going to say ... not only would you have to allocate millions of queues, you'd have to have them all continuously in play in order for the refnums to ever hit up against each other. Now if somewhere you're calling Obtain Queue and you're not calling Release Queue, you might be running your machine out of memory, and perhaps something strange is going on there (though I still can't imagine what would just cause the refnum to get deallocated).

jlokanis
post Aug 31 2008, 10:37 PM
Post #11




QUOTE (Aristos Queue @ Aug 29 2008, 08:25 PM) *
That was the other thing I was going to say ... not only would you have to allocate millions of queues, you'd have to have them all continuously in play in order for the refnums to ever hit up against each other. Now if somewhere you're calling Obtain Queue and you're not calling Release Queue, you might be running your machine out of memory, and perhaps something strange is going on there (though I still can't imagine what would just cause the refnum to get deallocated).


Well, I do not allocate millions of queues, but I do allocate thousands over the course of running the app.

My concern is this: I spawn many instances of a VIT to run a set of tests on a product. Each of these instances is composed completely of reentrant VIs (so they will not block each other). These reentrant VIs are all of the 'shared clone' type. They create the unnamed queues and pass them around to move data between parallel portions of the program. The only code in the entire app that can kill these queues is in the cleanup VI that is forced by dataflow to be the last thing executed by this spawned (from the VIT) VI.
Now, the launcher that spawns these VITs sets the spawned VI to Autoclose reference. So, the launcher is not responsible for dealing with this reference. When the spawned VI finishes execution, it will leave memory, as will all of its queues, notifiers, etc.
So what is confusing to me is this: if each spawned VIT creates its own queues (in sub VIs) and then listens to the queues (in other sub VIs), and the only code that can destroy those queue refs is also in a sub-VI of that VIT that is forced to execute last by dataflow, how could I ever get the error "Refnum became invalid while node waited for it"? Even if the VIT were stopped by an external VI, this error would never happen, and the code that logs the error to the event log would also not execute. So, something is stepping on my queue refs. If it was memory corruption, then what could be causing it? When I see this, my app is using about 100MB. The machine has 4GB of RAM and no other apps are running.

I suspect that the 'shared clone' reentrant mode and queue refs have some latent bug.




Aristos Queue
post Sep 1 2008, 05:52 AM
Post #12




QUOTE (jlokanis @ Aug 31 2008, 05:37 PM) *
I suspect that the 'shared clone' reentrant mode and queue refs have some latent bug.

Without someone actually inspecting the code, there are no further recommendations that I can make, but those are two independent subsystems, so I would be very doubtful of a bug caused by their interaction. I don't rule out the possibility of a bug in something, but not that.

Can you post your code on ni.com for an AE (applications engineer) to investigate? That's going to be the best way to get NI to push further on this.

PS: Even if your app is thousands of VIs, if you're able to share it with the AEs, they'll try to replicate the bug. There's an assumption that customers have to get their architectures down small before a bug will get investigated. But if you're convinced there's a bug in LabVIEW and only a huge application replicates it, then, please, submit the whole app if you can.
